**Isil Dillig Serdar Tasiran (Eds.)**

# LNCS 11561

# **Computer Aided Verification**

**31st International Conference, CAV 2019 New York City, NY, USA, July 15–18, 2019 Proceedings, Part I**

### Lecture Notes in Computer Science 11561

#### Commenced Publication in 1973

Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

#### Editorial Board Members

- David Hutchison, Lancaster University, Lancaster, UK
- Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
- Josef Kittler, University of Surrey, Guildford, UK
- Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
- Friedemann Mattern, ETH Zurich, Zurich, Switzerland
- John C. Mitchell, Stanford University, Stanford, CA, USA
- Moni Naor, Weizmann Institute of Science, Rehovot, Israel
- C. Pandu Rangan, Indian Institute of Technology Madras, Chennai, India
- Bernhard Steffen, TU Dortmund University, Dortmund, Germany
- Demetri Terzopoulos, University of California, Los Angeles, CA, USA
- Doug Tygar, University of California, Berkeley, CA, USA

More information about this series at http://www.springer.com/series/7407


Editors: Isil Dillig, University of Texas, Austin, TX, USA

Serdar Tasiran Amazon Web Services New York, NY, USA

ISSN 0302-9743, ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-030-25539-8, ISBN 978-3-030-25540-4 (eBook)
https://doi.org/10.1007/978-3-030-25540-4

LNCS Sublibrary: SL1 – Theoretical Computer Science and General Issues

© The Editor(s) (if applicable) and The Author(s) 2019. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

#### Preface

It was our privilege to serve as the program chairs for CAV 2019, the 31st International Conference on Computer-Aided Verification. CAV 2019 was held in New York, USA, during July 15–18, 2019. The tutorial day was on July 14, 2019, and the pre-conference workshops were held during July 13–14, 2019. All events took place at The New School in New York City.

CAV is an annual conference dedicated to the advancement of the theory and practice of computer-aided formal analysis methods for hardware and software systems. The primary focus of CAV is to extend the frontiers of verification techniques by expanding to new domains such as security, quantum computing, and machine learning. This puts CAV at the cutting edge of formal methods research, and this year's program is a reflection of this commitment.

CAV 2019 received a very high number of submissions (258). We accepted 13 tool papers, two case studies, and 52 regular papers, which amounts to an acceptance rate of roughly 26%. The accepted papers cover a wide spectrum of topics, from theoretical results to applications of formal methods. These papers apply or extend formal methods to a wide range of domains such as concurrency, learning, and industrially deployed systems. The program featured invited talks by Dawn Song (UC Berkeley), Swarat Chaudhuri (Rice University), and Ken McMillan (Microsoft Research) as well as invited tutorials by Emina Torlak (University of Washington) and Ranjit Jhala (UC San Diego). Furthermore, we continued the tradition of Logic Lounge, a series of discussions on computer science topics targeting a general audience.

In addition to the main conference, CAV 2019 hosted the following workshops: The Best of Model Checking (BeMC) in honor of Orna Grumberg, Design and Analysis of Robust Systems (DARS), Verification Mentoring Workshop (VMW), Numerical Software Verification (NSV), Verified Software: Theories, Tools, and Experiments (VSTTE), Democratizing Software Verification, Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), and Synthesis (SYNT).

Organizing a top conference like CAV requires a great deal of effort from the community. The Program Committee for CAV 2019 consisted of 79 members; a committee of this size ensures that each member reviews a reasonable number of papers in the allotted time. In all, the committee members wrote over 770 reviews while investing significant effort to maintain and ensure the high quality of the conference program. We are grateful to the CAV 2019 Program Committee for their outstanding efforts in evaluating the submissions and making sure that each paper got a fair chance.

Like last year's CAV, we made artifact evaluation mandatory for tool submissions and optional but encouraged for the rest of the accepted papers. The Artifact Evaluation Committee consisted of 27 reviewers who put in significant effort to evaluate each artifact. The goal of this process was to provide constructive feedback to tool developers and help make the research published in CAV more reproducible. The Artifact Evaluation Committee was generally quite impressed by the quality of the artifacts, and, in fact, all accepted tools passed the artifact evaluation. Among regular papers, 65% of the authors submitted an artifact, and 76% of these artifacts passed the evaluation. We are also very grateful to the Artifact Evaluation Committee for their hard work and dedication in evaluating the submitted artifacts.

CAV 2019 would not have been possible without the tremendous help we received from several individuals, and we would like to thank everyone who helped make CAV 2019 a success. First, we would like to thank Yu Feng and Ruben Martins for chairing the Artifact Evaluation Committee and Zvonimir Rakamaric for maintaining the CAV website and social media presence. We also thank Oksana Tkachuk for chairing the workshop organization process, Peter O'Hearn for managing sponsorship, and Thomas Wies for arranging student fellowships. We also thank Loris D'Antoni, Rayna Dimitrova, Cezara Dragoi, and Anthony W. Lin for organizing the Verification Mentoring Workshop and working closely with us. Last but not least, we would like to thank Kostas Ferles, Navid Yaghmazadeh, and members of the CAV Steering Committee (Ken McMillan, Aarti Gupta, Orna Grumberg, and Daniel Kroening) for helping us with several important aspects of organizing CAV 2019.

We hope that you will find the proceedings of CAV 2019 scientifically interesting and thought-provoking!

June 2019 Isil Dillig Serdar Tasiran

### Organization

#### Program Chairs



#### Program Committee


- Marc Brockschmidt, Microsoft, UK
- Pavol Cerny, University of Colorado Boulder, USA
- Swarat Chaudhuri, Rice University, USA
- Wei-Ngan Chin, National University of Singapore
- Adam Chlipala, Massachusetts Institute of Technology, USA
- Hana Chockler, King's College London, UK
- Eva Darulova, Max Planck Institute for Software Systems, Germany
- Cristina David, University of Cambridge, UK
- Dana Drachsler Cohen, ETH Zurich, Switzerland
- Cezara Dragoi, Inria Paris, ENS, France
- Constantin Enea, IRIF, University of Paris Diderot, France
- Azadeh Farzan, University of Toronto, Canada
- Grigory Fedyukovich, Princeton University, USA
- Yu Feng, University of California, Santa Barbara, USA
- Dana Fisman, Ben-Gurion University, Israel
- Milos Gligoric, The University of Texas at Austin, USA
- Patrice Godefroid, Microsoft, USA
- Laure Gonnord, University of Lyon/Laboratoire d'Informatique du Parallélisme, France
- Aarti Gupta, Princeton University, USA
- Arie Gurfinkel, University of Waterloo, Canada
- Klaus Havelund, Jet Propulsion Laboratory, USA
- Chris Hawblitzel, Microsoft, USA
- Alan J. Hu, The University of British Columbia, Canada
- Shachar Itzhaky, Technion, Israel
- Franjo Ivancic, Google, USA
- Ranjit Jhala, University of California San Diego, USA
- Rajeev Joshi, Automated Reasoning Group, Amazon Web Services, USA
- Dejan Jovanović, SRI International, USA
- Laura Kovacs, Vienna University of Technology, Austria
- Burcu Kulahcioglu Ozkan, MPI-SWS, Germany
- Marta Kwiatkowska, University of Oxford, UK
- Shuvendu Lahiri, Microsoft, USA
- Akash Lal, Microsoft, India
- Stephen Magill, Galois, Inc., USA
- Joao Marques-Silva, Universidade de Lisboa, Portugal
- Ruben Martins, Carnegie Mellon University, USA
- Ken McMillan, Microsoft, USA
- Vijay Murali, Facebook, USA
- Peter Müller, ETH Zurich, Switzerland
- Mayur Naik, Intel, USA
- Hakjoo Oh, Korea University, South Korea
- Oded Padon, Stanford University, USA
- Corina Pasareanu, CMU/NASA Ames Research Center, USA
- Ruzica Piskac, Yale University, USA


#### Artifact Evaluation Committee



#### Mentoring Workshop Organizing Committee


#### Steering Committee


#### Additional Reviewers

Sepideh Asadi Lucas Asadi Haniel Barbosa Ezio Bartocci Sam Bartocci Suda Bharadwaj Erdem Biyik Martin Biyik Timothy Bourke Julien Braine Steven Braine Benjamin Caulfield Eti Chaudhary Xiaohong Chaudhary Yinfang Chen Andreea Costea Murat Costea

Emanuele D'Osualdo Nicolas Dilley Marko Dilley Bruno Dutertre Marco Eilers Cindy Eilers Yotam Feldman Jerome Feret Daniel Feret Mahsa Ghasemi Shromona Ghosh Anthony Ghosh Bernhard Gleiss Shilpi Goel William Goel Mirazul Haque Ludovic Henrio

Andreas Henrio Antti Hyv ärinen Duligur Ibeling Rinat Ibeling Nouraldin Jaber Swen Jacobs Maximilian Jacobs Susmit Jha Anja Karl Jens Karl Sean Kauffman Ayrat Khalimov Bettina Khalimov Hillel Kugler Daniel Larraz Christopher Larraz Wonyeol Lee Matt Lewis Wenchao Lewis Kaushik Mallik Matteo Marescotti David Marescotti Dmitry Mordvinov Matthieu Moy Thanh Toan Moy Victor Nicolet Andres Noetzli Abraham Noetzli Saswat Padhi Karl Palmskog

Rong Palmskog Daejun Park Brandon Paulsen Lucas Paulsen Adi Yoga Prabawa Dhananjay Raju Andrew Raju Heinz Riener Sriram Sankaranarayanan Mark Sankaranarayanan Yagiz Savas Traian Florin Serbanuta Fu Serbanuta Yahui Song Pramod Subramanyan Rob Subramanyan Sol Swords Martin Tappler Ta Quang Tappler Anthony Vandikas Marcell Vazquex-Chanlatte Yuke Vazquex-Chanlatte Min Wen Josef Widder Bo Widder Haoze Wu Zhe Xu May Xu Yi Zhang Zhizhou Zhang

### Contents – Part I

#### Automata and Timed Systems


#### Synthesis


#### Model Checking


#### Cyber-Physical Systems and Machine Learning


#### Dynamical, Hybrid, and Reactive Systems


### Contents – Part II

#### Logics, Decision Procedures, and Solvers


#### Verification


#### Verification and Invariants


Andrei Damian, Cezara Drăgoi, Alexandru Militaru, and Josef Widder


### Automata and Timed Systems

### **Symbolic Register Automata**

Loris D'Antoni<sup>1</sup>, Tiago Ferreira<sup>2</sup>, Matteo Sammartino<sup>2</sup>(B), and Alexandra Silva<sup>2</sup>

<sup>1</sup> University of Wisconsin–Madison, Madison, WI 53706-1685, USA loris@cs.wisc.edu

<sup>2</sup> University College London, Gower Street, London WC1E 6BT, UK me@tiferrei.com, {m.sammartino,a.silva}@ucl.ac.uk

**Abstract.** Symbolic Finite Automata and Register Automata are two orthogonal extensions of finite automata motivated by real-world problems where data may have unbounded domains. These automata address a demand for a model over large or infinite alphabets, respectively. Both automata models have interesting applications and have been successful in their own right. In this paper, we introduce Symbolic Register Automata, a new model that combines features from both symbolic and register automata, with a view on applications that were previously out of reach. We study their properties and provide algorithms for emptiness, inclusion and equivalence checking, together with experimental results.

#### **1 Introduction**

Finite automata are a ubiquitous formalism that is simple enough to model many real-life systems and phenomena. They enjoy a large variety of theoretical properties that in turn play a role in practical applications. For example, finite automata are closed under Boolean operations, and have decidable emptiness and equivalence checking procedures. Unfortunately, finite automata have a fundamental limitation: they can only operate over finite (and typically small) alphabets. Two *orthogonal* families of automata models have been proposed to overcome this: *symbolic automata* and *register automata*. In this paper, we show that these two models can be combined yielding a new powerful model that can cover interesting applications previously out of reach for existing models.

Symbolic finite automata (SFAs) allow transitions to carry predicates over rich first-order alphabet theories, such as linear arithmetic, and therefore extend classic automata to operate over infinite alphabets [12]. For example, an SFA can define the language of all lists of integers in which the first and last elements are positive integer numbers. Despite their increased expressiveness, SFAs enjoy the same closure and decidability properties of finite automata—e.g., closure under Boolean operations and decidable equivalence and emptiness.
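
The predicate-transition idea can be made concrete with a small sketch. This is purely illustrative (the class and method names are made up, not the paper's implementation): transitions carry predicates, here plain Python functions over the integers, and the automaton accepts lists of length at least two whose first and last elements are positive.

```python
# Minimal sketch of a symbolic finite automaton (SFA): transitions carry
# predicates over an infinite alphabet instead of concrete characters.
# All names (Sfa, accepts) are illustrative, not from the paper.

class Sfa:
    def __init__(self, transitions, initial, finals):
        # transitions: list of (source, predicate, target),
        # where predicate is a function from characters to bool
        self.transitions = transitions
        self.initial = initial
        self.finals = finals

    def accepts(self, word):
        # standard nondeterministic subset simulation, but transitions
        # fire when the predicate holds of the input character
        states = {self.initial}
        for a in word:
            states = {q for (p, phi, q) in self.transitions
                      if p in states and phi(a)}
        return bool(states & self.finals)

# Language: lists of integers (length >= 2) whose first and last
# elements are positive.
positive = lambda x: x > 0
anything = lambda x: True
sfa = Sfa(
    transitions=[(0, positive, 1),   # first element must be positive
                 (1, anything, 1),   # middle elements unconstrained
                 (1, positive, 2)],  # last element must be positive
    initial=0, finals={2},
)

print(sfa.accepts([3, -7, 0, 5]))   # True
print(sfa.accepts([3, -7, 0, -5]))  # False
```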

This work was partially funded by NSF Grants CCF-1763871, CCF-1750965, a Facebook TAV Research Award, the ERC starting grant Profoundnet (679127) and a Leverhulme Prize (PLP-2016-129). See [10] for the full version of this paper.

Register automata (RA) support infinite alphabets by allowing input characters to be stored in registers during the computation and to be compared against existing values that are already stored in the registers [17]. For example, an RA can define the language of all lists of integers in which all numbers appearing in even positions are the same. RAs do not have some of the properties of finite automata (e.g., they cannot be determinized), but they still enjoy many useful properties that have made them a popular model in static analysis, software verification, and program monitoring [15].
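
A register-automaton run can be sketched in the same spirit (again illustrative, not the formal RA definition): a single register stores the first input value, and later inputs at even positions are compared against it.

```python
# Sketch of a one-register RA run for the language above: lists of
# integers in which all values at even positions (0-based) are equal.
# The function name is made up for illustration.

def even_positions_equal(word):
    r = None  # the single register, initially empty
    for i, a in enumerate(word):
        if i % 2 == 0:
            if r is None:
                r = a          # fresh value: store it in the register
            elif a != r:
                return False   # register comparison fails
    return True

print(even_positions_equal([7, 1, 7, 2, 7]))  # True
print(even_positions_equal([7, 1, 8, 2, 7]))  # False
```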

In this paper, we combine the best features of these two models—first order alphabet theories and registers—into a new model, *symbolic register automata* (SRA). SRAs are strictly more expressive than SFAs and RAs. For example, an SRA can define the language of all lists of integers in which the first and last elements are positive rational numbers and all numbers appearing in even positions are the same. This language is recognizable neither by an SFA nor by an RA.

While other attempts at combining symbolic automata and registers have resulted in undecidable models with limited closure properties [11], we show that SRAs enjoy the same closure and decidability properties of (non-symbolic) register automata. We propose a new application enabled by SRAs and implement our model in an open-source automata library.

In summary, our contributions are:


#### **2 Motivating Example**

In this section, we illustrate the capabilities of symbolic register automata using a simple example. Consider the regular expression r<sup>p</sup> shown in Fig. 1a. This expression, given a sequence of product descriptions, checks whether the products have the same code and lot number. The reader might not be familiar with some of the unusual syntax of this expression. In particular, r<sup>p</sup> uses two backreferences \1 and \2. The semantics of this construct is that the string matched by the regular expression for \1 (resp. \2) should be exactly the string that matched the subregular expression r appearing between the first (resp. second)


**Fig. 1.** Regular expression for matching products with same code and lot number—i.e., the characters of C and L are the same in all the products.

two parentheses, in this case (.{3}) (resp. (.)). Back-references allow regular expressions to check whether the encountered text is the same or is different from a string/character that appeared earlier in the input (see Figs. 1b and c for examples of positive and negative matches).

Representing this complex regular expression using an automaton model requires addressing several challenges. The expression rp:


Existing automata models do not address one or more of these challenges. Finite automata require one transition for each character in the input alphabet and blow-up when representing large alphabets. Symbolic finite automata (SFA) allow transitions to carry predicates over rich structured first-order alphabet theories and can describe, for example, character classes [12]. However, SFAs cannot directly check whether a character or a string is repeated in the input. An SFA for describing the regular expression r<sup>p</sup> would have to store the characters after C: directly in the states to later check whether they match the ones of the second product. Hence, the smallest SFA for this example would require billions of states! Register automata (RA) and their variants can store characters in registers during the computation and compare characters against values already stored in the registers [17]. Hence, RAs can check whether the two products have the same code. However, RAs only operate over unstructured infinite alphabets and cannot check, for example, that a character belongs to a given class.

The model we propose in this paper, *symbolic register automata* (SRA), combines the best features of SFAs and RAs—first-order alphabet theories and registers—and can address all three aforementioned challenges. Figure 1d shows a snippet of a symbolic register automaton A<sup>p</sup> corresponding to rp. Each transition in A<sup>p</sup> is labeled with a predicate that describes what characters can trigger the transition. For example, ^\s denotes that the transition can be triggered by any non-space character, L denotes that the transition can be triggered by the character L, and true denotes that the transition can be triggered by any character. Transitions of the form ϕ/→r<sup>i</sup> denote that, if a character x satisfies the predicate ϕ, the character is then stored in the register ri. For example, the transition out of state 1 reads any character and stores it in register r1. Finally, transitions of the form ϕ/= r<sup>i</sup> are triggered if a character x satisfies the predicate ϕ and x is the same character as the one stored in ri. For example, the transition out of state 2 can only be triggered by the same character that was stored in r<sup>1</sup> when reading the transition out of state 1—i.e., the first characters in the product codes should be the same.

SRAs are a natural model for describing regular expressions like rp, where capture groups are of bounded length, and hence correspond to finitely-many registers. The SRA A<sup>p</sup> has fewer than 50 states (vs. more than 100 billion for SFAs) and can, for example, be used to check whether an input string matches the given regular expression (e.g., monitoring). More interestingly, in this paper we study the closure and decidability properties of SRAs and provide an implementation for our model. For example, consider the following regular expression rpC that only checks whether the product codes are the same, but not the lot numbers:

*(rpC: the expression rp with the lot-number back-reference \2 replaced by a wildcard)*

The set of strings accepted by rpC is a superset of the set of strings accepted by rp. In this paper, we present simulation and bisimulation algorithms that can check this property. Our implementation can show that rpC subsumes r<sup>p</sup> in 25 s, and we could not find other tools that can prove the same property.

#### **3 Symbolic Register Automata**

In this section we introduce some preliminary notions, we define symbolic register automata and a variant that will be useful in proving decidability properties.

**Preliminaries.** An *effective Boolean algebra* A is a tuple (D, Ψ, ⟦·⟧, ⊥, ⊤, ∧, ∨, ¬), where: D is a set of domain elements; Ψ is a set of predicates closed under the Boolean connectives, with ⊥, ⊤ ∈ Ψ. The denotation function ⟦·⟧ : Ψ → 2<sup>D</sup> is such that ⟦⊥⟧ = ∅ and ⟦⊤⟧ = D, and for all ϕ, ψ ∈ Ψ, ⟦ϕ ∨ ψ⟧ = ⟦ϕ⟧ ∪ ⟦ψ⟧, ⟦ϕ ∧ ψ⟧ = ⟦ϕ⟧ ∩ ⟦ψ⟧, and ⟦¬ϕ⟧ = D \ ⟦ϕ⟧. For ϕ ∈ Ψ, we write isSat(ϕ) whenever ⟦ϕ⟧ ≠ ∅ and say that ϕ is *satisfiable*. A is *decidable* if isSat is decidable. For each a ∈ D, we assume a predicate atom(a) such that ⟦atom(a)⟧ = {a}.

*Example 1.* The theory of linear integer arithmetic forms an effective BA, where D = ℤ and Ψ contains formulas ϕ(x) in the theory with one fixed integer variable. For example, div<sub>k</sub> := (x mod k) = 0 denotes the set of all integers divisible by k.

**Notation.** Given a set S, we write P(S) for its powerset. Given a function f : A → B, we write f[a ↦ b] for the function such that f[a ↦ b](a) = b and f[a ↦ b](x) = f(x) for x ≠ a. Analogously, we write f[S ↦ b], with S ⊆ A, to map multiple values to the same b. The *pre-image* of f is the function f<sup>−1</sup> : P(B) → P(A) given by f<sup>−1</sup>(S) = {a | ∃b ∈ S : b = f(a)}; for readability, we will write f<sup>−1</sup>(x) when S = {x}. Given a relation R ⊆ A × B, we write aRb for (a, b) ∈ R.
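
The algebra operations can be sketched concretely. The following is a minimal illustration (all names are made up, and isSat is decided by brute force over a finite window, standing in for a real decision procedure such as an SMT solver):

```python
# Sketch of an effective Boolean algebra over the integers: predicates
# are characteristic functions D -> bool, closed under /\, \/, and
# negation; isSat searches a finite fragment of the domain.

class Pred:
    def __init__(self, holds):
        self.holds = holds  # characteristic function of the denotation

    def __and__(self, other):
        return Pred(lambda x: self.holds(x) and other.holds(x))

    def __or__(self, other):
        return Pred(lambda x: self.holds(x) or other.holds(x))

    def __invert__(self):
        return Pred(lambda x: not self.holds(x))

BOT = Pred(lambda x: False)  # denotes the empty set
TOP = Pred(lambda x: True)   # denotes all of D

def is_sat(phi, window=range(-1000, 1000)):
    # decidability stand-in: search a finite window of the domain
    return any(phi.holds(x) for x in window)

def atom(a):
    # atom(a) denotes exactly {a}
    return Pred(lambda x: x == a)

div3 = Pred(lambda x: x % 3 == 0)
div5 = Pred(lambda x: x % 5 == 0)
print(is_sat(div3 & div5))   # True (e.g. 15)
print(is_sat(div3 & ~div3))  # False
```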

**Model Definition.** Symbolic register automata have transitions of the form:

$$p \xrightarrow{\varphi/E, I, U} q$$

where p and q are states, ϕ is a predicate from a fixed effective Boolean algebra, and E, I, U are subsets of a fixed finite set of registers R. The intended interpretation of the above transition is: an input character a can be read in state p if (i) a ∈ ⟦ϕ⟧, (ii) the content of all the registers in E is *equal* to a, and (iii) the content of all the registers in I is *different* from a. If the transition succeeds, then a is stored into all the registers in U and the automaton moves to q.

*Example 2.* The transition labels in Fig. 1d have been conveniently simplified to ease intuition. These labels correspond to full SRA labels as follows:

$$\varphi/{\to} r \implies \varphi/\emptyset, \emptyset, \{r\} \qquad \varphi/{=} r \implies \varphi/\{r\}, \emptyset, \emptyset \qquad \varphi \implies \varphi/\emptyset, \emptyset, \emptyset$$

Given a set of registers R, the transitions of an SRA have labels over the following set: L<sup>R</sup> = Ψ × {(E,I,U) ∈ P(R) × P(R) × P(R) | E ∩ I = ∅}. The condition E ∩ I = ∅ guarantees that register constraints are always satisfiable.

**Definition 1 (Symbolic Register Automaton).** *A* symbolic register automaton *(SRA) is a 6-tuple* (R, Q, q0, v0, F, Δ)*, where* R *is a finite set of* registers*,* Q *is a finite set of* states*,* q0 ∈ Q *is the* initial *state,* v0 : R → D ∪ {♯} *is the* initial register assignment *(if* v0(r) = ♯*, the register* r *is considered* empty*),* F ⊆ Q *is a finite set of* final *states, and* Δ ⊆ Q × L<sup>R</sup> × Q *is the* transition relation*. Transitions* (p, (ϕ, ℓ), q) ∈ Δ *will be written as* $p \xrightarrow{\varphi/\ell} q$*.*

An SRA can be seen as a finite description of a (possibly infinite) labeled transition system (LTS), where states have been assigned concrete register values, and transitions read a single symbol from the potentially infinite alphabet. This so-called *configuration LTS* will be used in defining the semantics of SRAs.

**Definition 2 (Configuration LTS).** *Given an SRA* S*, the* configuration LTS CLTS(S) *is defined as follows. A* configuration *is a pair* (p, v) *where* p ∈ Q *is a state of* S *and* v : R → D ∪ {♯} *is a* register assignment*;* (q0, v0) *is called the* initial configuration*; every* (q, v) *such that* q ∈ F *is a* final *configuration. The set of transitions between configurations is defined as follows:*

$$\frac{p \xrightarrow{\varphi/E, I, U} q \in \Delta \qquad E \subseteq v^{-1}(a) \quad I \cap v^{-1}(a) = \emptyset}{(p, v) \xrightarrow{a} (q, v[U \mapsto a]) \in \mathsf{CLTS}(\mathbb{S})}$$

Intuitively, the rule says that an SRA transition from p can be instantiated to one from (p, v) that reads a when the registers containing the value a, namely v<sup>−1</sup>(a), satisfy the constraint described by E, I (a is contained in the registers in E but not in those in I). If the constraint is satisfied, all registers in U are assigned a.
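
The rule can be sketched directly (illustrative names, not the paper's implementation): a symbolic transition (ϕ, E, I, U) fires on input a from assignment v iff a satisfies ϕ, every register in E holds a, and no register in I holds a; on success all registers in U are set to a.

```python
# Sketch of one CLTS step for a single SRA transition label.

def step(v, a, phi, E, I, U):
    """v: dict mapping register -> value (missing key = empty register).
    Returns the updated assignment, or None if the transition is
    disabled on input a."""
    holding_a = {r for r, val in v.items() if val == a}  # v^-1(a)
    if not phi(a):
        return None                 # predicate fails
    if not E <= holding_a:          # some register in E differs from a
        return None
    if I & holding_a:               # some register in I already holds a
        return None
    w = dict(v)
    w.update({r: a for r in U})     # v[U -> a]
    return w

# Mimicking Fig. 1d: the transition out of state 1 reads any character
# and stores it in r1; the one out of state 2 requires equality with r1.
v1 = step({}, 'X', lambda c: True, E=set(), I=set(), U={'r1'})
print(v1)                                                         # {'r1': 'X'}
print(step(v1, 'X', lambda c: True, E={'r1'}, I=set(), U=set()))  # {'r1': 'X'}
print(step(v1, 'Y', lambda c: True, E={'r1'}, I=set(), U=set()))  # None
```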

A *run* of the SRA S is a sequence of transitions in CLTS(S) starting from the initial configuration. A configuration is *reachable* whenever there is a run ending up in that configuration. The *language* of an SRA S is defined as

$$\mathcal{L}(\mathbb{S}) := \{ a_1 \ldots a_n \in \mathcal{D}^n \mid \exists (q_0, v_0) \xrightarrow{a_1} \cdots \xrightarrow{a_n} (q_n, v_n) \in \mathsf{CLTS}(\mathbb{S}),\ q_n \in F \}.$$

An SRA S is *deterministic* if its configuration LTS is; namely, for every word w ∈ D<sup>∗</sup> there is at most one run in CLTS(S) spelling w. Determinism is important for some application contexts, e.g., for runtime monitoring. Since SRAs subsume RAs, nondeterministic SRAs are strictly more expressive than deterministic ones, and language equivalence is undecidable for nondeterministic SRAs [27].

We now introduce the notions of *simulation* and *bisimulation* for SRAs, which capture whether one SRA behaves "at least as" or "exactly as" another one.

**Definition 3 ((Bi)simulation for SRAs).** *A simulation* R *on SRAs* S<sup>1</sup> *and* S<sup>2</sup> *is a binary relation* R *on configurations such that* (p1, v1)R(p2, v2) *implies:*


*A simulation* R *is a* bisimulation *if* R<sup>−1</sup> *is also a simulation. We write* S1 ≺ S2 *(resp.* S1 ∼ S2*) whenever there is a simulation (resp. bisimulation)* R *such that* (q01, v01) R (q02, v02)*, where* (q0i, v0i) *is the initial configuration of* Si*, for* i = 1, 2*.*

We say that an SRA is *complete* whenever for every configuration (p, v) and every a ∈ D there is a transition $(p, v) \xrightarrow{a} (q, w)$ in CLTS(S). The following results connect similarity and language inclusion.

**Proposition 1.** *If* S<sup>1</sup> ≺ S<sup>2</sup> *then L* (S1) ⊆ *L* (S2)*. If* S<sup>1</sup> *and* S<sup>2</sup> *are deterministic and complete, then the other direction also holds.*

It is worth noting that given a deterministic SRA we can define its *completion* by adding transitions so that every value a ∈ D can be read from any state.

*Remark 1.* RAs and SFAs can be encoded as SRAs on the same state-space:


SRAs are *strictly more expressive* than both RAs and SFAs. For instance, the language {n<sub>0</sub>n<sub>1</sub> ... n<sub>k</sub> | n<sub>0</sub> = n<sub>k</sub>, even(n<sub>i</sub>), n<sub>i</sub> ∈ ℤ, i = 1, ..., k} of finite sequences of even integers where the first and the last element coincide can be recognized by an SRA, but neither by an RA nor by an SFA.
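
The separating language can be sketched as a membership check (illustrative only): recognizing it needs both a symbolic predicate (evenness, beyond RAs) and a register comparison (first equals last, beyond SFAs).

```python
# Sketch of the SRA-recognizable separating language: finite sequences
# of even integers whose first and last elements coincide.

def accepts(word):
    if not word:
        return False
    r = None  # single register
    for a in word:
        if a % 2 != 0:       # symbolic predicate: even(a)
            return False
        if r is None:
            r = a            # fresh transition: store the first element
    return word[-1] == r     # register comparison: last equals first

print(accepts([4, 10, 2, 4]))  # True
print(accepts([4, 10, 3, 4]))  # False (3 is odd)
print(accepts([4, 10, 2, 6]))  # False (first != last)
```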

**Boolean Closure Properties.** SRAs are closed under intersection and union. Intersection is given by a standard product construction whereas union is obtained by adding a new initial state that mimics the initial states of both automata.

**Proposition 2 (Closure under intersection and union).** *Given SRAs* S<sup>1</sup> *and* S2*, there are SRAs* S1∩S<sup>2</sup> *and* S1∪S<sup>2</sup> *such that L* (S1∩S2) = *L* (S1)∩*L* (S2) *and L* (S<sup>1</sup> ∪ S2) = *L* (S1) ∪ *L* (S2)*.*

SRAs in general are not closed under complementation, because RAs are not. However, we still have closure under complementation for a subclass of SRAs.

**Proposition 3.** *Let* S *be a complete and deterministic SRA, and let* S̄ *be the SRA defined as* S*, except that its final states are* Q \ F*. Then* *L*(S̄) = D<sup>∗</sup> \ *L*(S)*.*

#### **4 Decidability Properties**

In this section we will provide algorithms for checking determinism and emptiness for an SRA, and (bi)similarity of two SRAs. Our algorithms leverage *symbolic* techniques that use the finite syntax of SRAs to indirectly operate over the underlying configuration LTS, which can be infinite.

**Single-Valued Variant.** To study decidability, it is convenient to restrict register assignments to *injective* ones on non-empty registers, that is, functions v : R → D ∪ {♯} such that v(r) = v(s) and v(r) ≠ ♯ imply r = s. This is also the approach taken for RAs in the seminal papers [17,27]. Both for RAs and SRAs, this restriction does not affect expressivity. We say that an SRA is *single-valued* if its initial assignment v0 is injective on non-empty registers. For single-valued SRAs, we only allow two kinds of transitions:

**Read transition:** $p \xrightarrow{\varphi/r^{=}} q$ triggers when a ∈ ⟦ϕ⟧ and a is already stored in r.

**Fresh transition:** $p \xrightarrow{\varphi/r^{\bullet}} q$ triggers when the input a ∈ ⟦ϕ⟧ and a is *fresh*, i.e., it is not stored in any register. After the transition, a is stored into r.

SRAs and their single-valued variants have the same expressive power. Translating single-valued SRAs to ordinary ones is straightforward:

$$p \xrightarrow{\varphi/r^{=}} q \implies p \xrightarrow{\varphi/\{r\}, \emptyset, \emptyset} q \qquad \qquad p \xrightarrow{\varphi/r^{\bullet}} q \implies p \xrightarrow{\varphi/\emptyset, R, \{r\}} q$$

The opposite translation requires a state-space blow up, because we need to encode register equalities in the states.

**Theorem 1.** *Given an SRA* S *with* n *states and* r *registers, there is a single-valued SRA* S′ *with* O(nr<sup>r</sup>) *states and* r + 1 *registers such that* S ∼ S′*. Moreover, the translation preserves determinism.*

**Normalization.** While our techniques are inspired by analogous ones for non-symbolic RAs, SRAs present an additional challenge: they can have arbitrary predicates on transitions. Hence, the values that each transition can read, and thus which configurations it can reach, depend on the history of past transitions and their predicates. This problem emerges when checking reachability and similarity, because a transition may be *disabled* by particular register values and so lead to unsound conclusions, a problem that does not exist in register automata.

*Example 3.* Consider the SRA below, defined over the BA of integers.

All predicates on transitions are satisfiable, yet *L*(S) = ∅. To go from 0 to 1, S must read a value n such that div3(n) and n ≠ 0, and then n is stored into r. The transition from 1 to 2 can only happen if the content of r also satisfies div5(n) and n ∈ [0, 10]. However, there is no n satisfying div3(n) ∧ n ≠ 0 ∧ div5(n) ∧ n ∈ [0, 10], hence the transition from 1 to 2 never happens.

To handle the complexity caused by predicates, we introduce a way of *normalizing* an SRA to an equivalent one that *stores additional information about input predicates*. We first introduce some notation and terminology.

A register abstraction θ for S, used to keep track of the domains of registers, is a family of predicates indexed by the registers R of S. Given a register assignment v, we write v |= θ whenever v(r) ∈ θ<sub>r</sub> if r holds a value, and θ<sub>r</sub> = ⊥ if r is empty. Hereafter we shall only consider "meaningful" register abstractions, for which there is at least one assignment v such that v |= θ.

With the contextual information about register domains given by θ, we say that a transition p −ϕ/ℓ→ q ∈ Δ is *enabled by* θ whenever it has at least one instance (p, v) −a→ (q, w) in CLTS(S), for every v |= θ. Enabled transitions are important when reasoning about reachability and similarity.

Checking whether a transition has at least one instance in the CLTS is difficult in practice, especially when ℓ = r<sup>•</sup>, because it amounts to checking whether ϕ \ img(v) ≠ ∅ for every injective v |= θ.

To make the check for enabledness practical we will use minterms. For a set of predicates Φ, a *minterm* is a minimal satisfiable Boolean combination of all predicates that occur in Φ. Minterms are the analogue of atoms in a complete atomic Boolean algebra. E.g., the set of predicates Φ = {x > 2, x < 5} over the theory of linear integer arithmetic has minterms mint(Φ) = {(x > 2) ∧ (x < 5), ¬(x > 2) ∧ (x < 5), (x > 2) ∧ ¬(x < 5)}. Given ψ ∈ mint(Φ) and ϕ ∈ Φ, we will write ϕ ⊑ ψ whenever ϕ appears non-negated in ψ, for instance (x > 2) ⊑ (x > 2) ∧ ¬(x < 5). A crucial property of minterms is that they do not overlap, i.e., isSat(ψ<sub>1</sub> ∧ ψ<sub>2</sub>) if and only if ψ<sub>1</sub> = ψ<sub>2</sub>, for minterms ψ<sub>1</sub> and ψ<sub>2</sub>.
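As a small illustration (not the SVPALib implementation), minterms can be computed by brute force over a finite sample domain, representing each minterm by the signs it gives to the predicates:

```python
from itertools import product

def minterms(preds, domain):
    """Minterms of `preds`: satisfiable conjunctions in which every
    predicate appears exactly once, positively or negated.
    Satisfiability is checked by enumeration over a finite `domain`;
    a real implementation would query an SMT solver instead."""
    result = []
    for signs in product([True, False], repeat=len(preds)):
        # The combination holds at x iff each predicate matches its sign.
        holds = lambda x: all(p(x) == s for p, s in zip(preds, signs))
        if any(holds(x) for x in domain):
            result.append(signs)   # represent the minterm as its sign vector
    return result

preds = [lambda x: x > 2, lambda x: x < 5]
# Three satisfiable sign vectors, matching the example; (False, False)
# is absent, since no integer is both <= 2 and >= 5.
print(minterms(preds, range(-10, 11)))
```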

**Lemma 1 (Enabledness).** *Let* θ *be a register abstraction such that* θ<sub>r</sub> *is a minterm, for all* r ∈ R*. If* ϕ *is a minterm, then* p −ϕ/ℓ→ q *is enabled by* θ *iff:*

*(1) if* ℓ = r<sup>=</sup>*, then* ϕ = θ<sub>r</sub>*; (2) if* ℓ = r<sup>•</sup>*, then* |ϕ| > *E*(θ, ϕ)*, where E*(θ, ϕ) = |{r ∈ R | θ<sub>r</sub> = ϕ}| *is the number of registers with values from* ϕ*.*

Intuitively, (1) says that if the transition reads a symbol stored in r satisfying ϕ, the symbol must also satisfy θ<sub>r</sub>, the range of r. Because ϕ and θ<sub>r</sub> are minterms, this only happens when ϕ = θ<sub>r</sub>. (2) says that the enabling condition ϕ \ img(v) ≠ ∅, for all injective v |= θ, holds if and only if there are fewer registers storing values from ϕ than the cardinality of ϕ; this implies we can always find a fresh element of ϕ to enable the transition. Registers holding values from ϕ are exactly those r ∈ R such that θ<sub>r</sub> = ϕ. Both conditions can be effectively checked: the first is a simple predicate-equivalence check, while the second amounts to checking whether ϕ holds for at least a certain number k of distinct elements. This can be achieved by checking satisfiability of ϕ ∧ ¬atom(a<sub>1</sub>) ∧ ··· ∧ ¬atom(a<sub>k−1</sub>), for a<sub>1</sub>, ..., a<sub>k−1</sub> distinct elements of ϕ.
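The cardinality check can be sketched as follows (a toy model in Python: the finite `domain` and linear scan stand in for satisfiability queries of the form ϕ ∧ ¬atom(a<sub>1</sub>) ∧ ··· ∧ ¬atom(a<sub>i</sub>)):

```python
def has_at_least(phi, k, domain):
    """Check |phi| >= k by repeatedly finding a model and excluding it,
    mimicking the chain of satisfiability checks described above.
    `domain` is a finite iterable standing in for an SMT solver."""
    excluded = set()
    for _ in range(k):
        witness = next((x for x in domain if phi(x) and x not in excluded), None)
        if witness is None:
            return False           # phi has fewer than k distinct elements
        excluded.add(witness)      # corresponds to conjoining ¬atom(witness)
    return True

even = lambda x: x % 2 == 0
print(has_at_least(even, 3, range(10)))   # True: 0, 2, 4 are found
print(has_at_least(even, 6, range(10)))   # False: only 0, 2, 4, 6, 8 exist
```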

*Remark 2.* Using single-valued SRAs to check enabledness might seem like a restriction. However, if one started from a generic SRA, the process of checking enabledness would contain an extra step: for each state p, we would have to keep track of all possible equations among registers. In fact, register equalities determine (i) whether the register constraints of an outgoing transition are satisfiable, and (ii) how many elements of the guard we need for the transition to happen, analogously to condition (2) of Lemma 1. Generating such equations is the key idea behind Theorem 1, and corresponds precisely to turning the SRA into a single-valued one.

Given any SRA, we can use the notion of register abstraction to build an equivalent *normalized* SRA, where (*i*) states keep track of how the domains of registers change along transitions, and (*ii*) transitions are obtained by breaking the transitions of the original SRA into minterms and discarding those that are disabled according to Lemma 1. In the following we write mint(S) for the minterms of the set of predicates {ϕ | p −ϕ/ℓ→ q ∈ Δ} ∪ {atom(v<sub>0</sub>(r)) | v<sub>0</sub>(r) ∈ D, r ∈ R}. Observe that an atomic predicate always has an equivalent minterm, hence we will use atomic predicates to define the initial register abstraction.

**Definition 4 (Normalized SRA).** *Given an SRA* S*, its normalization* N(S) *is the SRA* (R, N(Q), N(q0), v0, N(F), N(Δ)) *where:*


The automaton N(S) enjoys the desired property: each transition from a state θ ▷ p is enabled by θ, by construction. N(S) is always *finite*. In fact, suppose S has n states, m transitions, and r registers. Then N(S) has at most m predicates, and |mint(S)| is O(2<sup>m</sup>). Since the possible register abstractions are O(r·2<sup>m</sup>), N(S) has O(n·r·2<sup>m</sup>) states and O(m·r<sup>2</sup>·2<sup>3m</sup>) transitions.

*Example 4.* We now show the normalized version of Example 3. The first step is computing the set mint(S) of minterms for S, i.e., the satisfiable Boolean combinations of {atom(0), div3, [0, 10] ∧ div5, (< 0) ∨ (> 10)}. For simplicity, we represent minterms as bitvectors where a 0 component means that the corresponding predicate is negated; e.g., [1, 1, 1, 0] stands for the minterm atom(0) ∧ div3 ∧ ([0, 10] ∧ div5) ∧ ¬((< 0) ∨ (> 10)). Minterms and the resulting SRA N(S) are shown below.

On each transition we show how it is broken down to minterms, and for each state we show the register abstraction (note that state 1 becomes two states in N(S)). The transition from 1 to 2 is *not* part of N(S) – this is why it is dotted. In fact, in every register abstraction [r → m] reachable at state 1, the component for the transition guard [0, 10] ∧ div5 in the minterm m (3rd component) is 0, i.e., ([0, 10] ∧ div5) ⋢ m. Intuitively, this means that r will never be assigned a value that satisfies [0, 10] ∧ div5. As a consequence, the construction of Definition 4 will not add a transition from 1 to 2.

Finally, we show that the normalized SRA behaves exactly as the original one.

**Proposition 4.** (p, v) ∼ (θ ▷ p, v)*, for all* p ∈ Q *and* v |= θ*. Hence,* S ∼ N(S)*.*

**Emptiness and Determinism.** The transitions of N(S) are always enabled by construction, therefore every path in N(S) always corresponds to a run in CLTS(N(S)).

**Lemma 2.** *The state* θ ▷ p *is reachable in* N(S) *if and only if there is a reachable configuration* (θ ▷ p, v) *in* CLTS(N(S)) *such that* v |= θ*. Moreover, if* (θ ▷ p, v) *is reachable, then all configurations* (θ ▷ p, w) *such that* w |= θ *are reachable.*

Therefore, using Proposition 4, we can reduce the reachability and emptiness problems of S to that of N(S).

**Theorem 2 (Emptiness).** *There is an algorithm to decide reachability of any configuration of* S*, hence whether L* (S) = ∅*.*

*Proof.* Let (p, v) be a configuration of S. To decide whether it is reachable in CLTS(S), we can perform a visit of N(S) from its initial state, stopping when a state θ ▷ p such that v |= θ is reached. If we are just looking for a final state, we can stop at any state such that p ∈ F. In fact, by Proposition 4, there is a run in CLTS(S) ending in (p, v) if and only if there is a run in CLTS(N(S)) ending in (θ ▷ p, v) such that v |= θ. By Lemma 2, the latter holds if and only if there is a path in N(S) ending in θ ▷ p. This algorithm has the complexity of a standard visit of N(S), namely O(n·r·2<sup>m</sup> + m·r<sup>2</sup>·2<sup>3m</sup>).
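The visit in this proof is a standard graph search. The following sketch makes that concrete on an explicit successor map (a stand-in for the lazily computed N(S); the states and maps below are ours, chosen to mimic Examples 3 and 4, not taken from the paper's implementation):

```python
from collections import deque

def reachable_final(initial, successors, is_final):
    """Standard visit of N(S): explore each state once, stopping as soon
    as a state satisfying `is_final` is found. `successors` maps a state
    to its transition targets."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        if is_final(state):
            return True
        for nxt in successors.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Toy graph shaped like Example 3/4: the disabled transition from
# state 1 to the final state 2 is simply absent in the normalized SRA.
succ = {0: [1], 1: [1]}
print(reachable_final(0, succ, lambda s: s == 2))  # -> False: L(S) is empty
```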

Now that we have characterized which transitions are reachable, we define what it means for a normalized SRA to be deterministic, and we show that determinism is preserved by the translation from SRAs.

**Proposition 5 (Determinism).** N(S) *is* deterministic *if and only if for all reachable transitions* p −ϕ<sub>1</sub>/ℓ<sub>1</sub>→ q<sub>1</sub>, p −ϕ<sub>2</sub>/ℓ<sub>2</sub>→ q<sub>2</sub> ∈ N(Δ) *the following holds:* ϕ<sub>1</sub> ≠ ϕ<sub>2</sub> *whenever either (1)* ℓ<sub>1</sub> = ℓ<sub>2</sub> *and* q<sub>1</sub> ≠ q<sub>2</sub>*, or (2)* ℓ<sub>1</sub> = r<sup>•</sup>*,* ℓ<sub>2</sub> = s<sup>•</sup>*, and* r ≠ s*.*

One can check determinism of an SRA by looking at its normalized version.

**Proposition 6.** S *is deterministic if and only if* N(S) *is deterministic.*

**Similarity and Bisimilarity.** We now introduce a symbolic technique to decide similarity and bisimilarity of SRAs. The basic idea is similar to *symbolic (bi)simulation* [20,27] for RAs. Recall that RAs are SRAs whose transition guards are all ⊤. Given two RAs S<sub>1</sub> and S<sub>2</sub>, a symbolic simulation between them is defined over their state spaces Q<sub>1</sub> and Q<sub>2</sub>, not over their configurations. For this to work, one needs to add an extra piece of information about how the registers of the two states are related. More precisely, a symbolic simulation is a relation on triples (p<sub>1</sub>, p<sub>2</sub>, σ), where p<sub>1</sub> ∈ Q<sub>1</sub>, p<sub>2</sub> ∈ Q<sub>2</sub> and σ ⊆ R<sub>1</sub> × R<sub>2</sub> is a *partial injective* function. This function encodes constraints between registers: (r, s) ∈ σ is an equality constraint between r ∈ R<sub>1</sub> and s ∈ R<sub>2</sub>, and (r, s) ∉ σ is an inequality constraint. Intuitively, (p<sub>1</sub>, p<sub>2</sub>, σ) says that all configurations (p<sub>1</sub>, v<sub>1</sub>) and (p<sub>2</sub>, v<sub>2</sub>) such that v<sub>1</sub> and v<sub>2</sub> satisfy σ – e.g., v<sub>1</sub>(r) = v<sub>2</sub>(s) whenever (r, s) ∈ σ – are in the simulation relation (p<sub>1</sub>, v<sub>1</sub>) ≺ (p<sub>2</sub>, v<sub>2</sub>). In the following we will use v<sub>1</sub> ⋈ v<sub>2</sub> to denote the function encoding the constraints among v<sub>1</sub> and v<sub>2</sub>; explicitly, σ(r) = s if and only if v<sub>1</sub>(r) = v<sub>2</sub>(s) and r is not empty.

**Definition 5 (Symbolic (bi)similarity** [27]**).** *A symbolic simulation is a relation* R ⊆ Q<sub>1</sub> × Q<sub>2</sub> × P(R<sub>1</sub> × R<sub>2</sub>) *such that if* (p<sub>1</sub>, p<sub>2</sub>, σ) ∈ R*, then* p<sub>1</sub> ∈ F<sub>1</sub> *implies* p<sub>2</sub> ∈ F<sub>2</sub>*, and if* p<sub>1</sub> −ℓ→ q<sub>1</sub> ∈ Δ<sub>1</sub><sup>1</sup> *then:*

*1. if* ℓ = r<sup>=</sup>*:*

*(a) if* r ∈ dom(σ)*, then there is* p<sub>2</sub> −σ(r)<sup>=</sup>→ q<sub>2</sub> ∈ Δ<sub>2</sub> *such that* (q<sub>1</sub>, q<sub>2</sub>, σ) ∈ R*.*

*(b) if* r ∉ dom(σ)*, then there is* p<sub>2</sub> −s<sup>•</sup>→ q<sub>2</sub> ∈ Δ<sub>2</sub> *s.t.* (q<sub>1</sub>, q<sub>2</sub>, σ[r → s]) ∈ R*.*

<sup>1</sup> We will keep the guard implicit for succinctness.

*2. if* ℓ = r<sup>•</sup>*:*

	- *(a) for all* s ∈ R<sub>2</sub> \ img(σ)*, there is* p<sub>2</sub> −s<sup>=</sup>→ q<sub>2</sub> ∈ Δ<sub>2</sub> *such that* (q<sub>1</sub>, q<sub>2</sub>, σ[r → s]) ∈ R*, and;*
	- *(b) there is* p<sub>2</sub> −s<sup>•</sup>→ q<sub>2</sub> ∈ Δ<sub>2</sub> *such that* (q<sub>1</sub>, q<sub>2</sub>, σ[r → s]) ∈ R*.*

*Here* σ[r → s] *stands for* (σ \ {(σ<sup>−1</sup>(s), s)}) ∪ {(r, s)}*, which ensures that* σ *stays injective when updated.*

*Given a symbolic simulation* R*, its inverse is defined as* R<sup>−1</sup> = {t<sup>−1</sup> | t ∈ R}*, where* (p<sub>1</sub>, p<sub>2</sub>, σ)<sup>−1</sup> = (p<sub>2</sub>, p<sub>1</sub>, σ<sup>−1</sup>)*. A* symbolic bisimulation R *is a relation such that both* R *and* R<sup>−1</sup> *are symbolic simulations.*
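The bookkeeping on σ is easy to make concrete. In the following Python sketch (the dict-based encoding and register names are ours, for illustration only), a partial injective function is a dict, the injectivity-preserving update implements σ[r → s], and `join` implements v<sub>1</sub> ⋈ v<sub>2</sub> with `None` modeling an empty register:

```python
def update(sigma, r, s):
    """sigma[r -> s]: add the pair (r, s), first dropping any pair that
    maps some other register to s, so the function stays injective."""
    return {**{k: v for k, v in sigma.items() if v != s}, r: s}

def join(v1, v2):
    """v1 ⋈ v2: constraints induced by two concrete assignments,
    sigma(r) = s iff v1(r) = v2(s) and r is not empty."""
    return {r: s for r, a in v1.items() if a is not None
                 for s, b in v2.items() if a == b}

sigma = {"r1": "s1", "r2": "s2"}
print(update(sigma, "r3", "s1"))  # r1 loses its image: {'r2': 's2', 'r3': 's1'}
print(join({"r1": 5, "r2": None}, {"s1": 5, "s2": 7}))  # {'r1': 's1'}
```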

Case 1 deals with the case when p<sub>1</sub> can perform a transition that reads the register r. If r ∈ dom(σ), meaning that r and σ(r) ∈ R<sub>2</sub> contain the same value, then p<sub>2</sub> must be able to read σ(r) as well. If r ∉ dom(σ), then the content of r is fresh w.r.t. p<sub>2</sub>, so p<sub>2</sub> must be able to read any fresh value – in particular the content of r. Case 2 deals with the case when p<sub>1</sub> reads a fresh value. It ensures that p<sub>2</sub> is able to read all possible values that are fresh for p<sub>1</sub>, whether already in some register s (i.e., s ∈ R<sub>2</sub> \ img(σ), case 2(a)) or fresh for p<sub>2</sub> as well (case 2(b)). In all these cases, σ must be updated to reflect the new equalities among registers.

Keeping track of equalities among registers is enough for RAs, because the actual content of registers does not determine the capability of a transition to fire (RA transitions have implicit guards). As seen in Example 3, this is no longer the case for SRAs: a transition may or may not happen depending on the register assignment being compatible with the transition guard.

As in the case of reachability, normalized SRAs provide the solution to this problem. We will reduce the problem of checking (bi)similarity of S<sub>1</sub> and S<sub>2</sub> to that of checking symbolic (bi)similarity on N(S<sub>1</sub>) and N(S<sub>2</sub>), with minor modifications to the definition. To do this, we need to assume that the minterms for both N(S<sub>1</sub>) and N(S<sub>2</sub>) are computed over the union of the predicates of S<sub>1</sub> and S<sub>2</sub>.

**Definition 6 (**N**-simulation).** *An N-simulation on* S<sub>1</sub> *and* S<sub>2</sub> *is a relation* R ⊆ N(Q<sub>1</sub>) × N(Q<sub>2</sub>) × P(R<sub>1</sub> × R<sub>2</sub>)*, defined as in Definition 5, with the following modifications:*


*2(a)' for all* s ∈ R<sub>2</sub> \ img(σ) *such that* ϕ<sub>1</sub> = (θ<sub>2</sub>)<sub>s</sub>*, there is* θ<sub>2</sub> ▷ p<sub>2</sub> −ϕ<sub>1</sub>/s<sup>=</sup>→ θ′<sub>2</sub> ▷ q<sub>2</sub> ∈ N(Δ<sub>2</sub>) *such that* (θ′<sub>1</sub> ▷ q<sub>1</sub>, θ′<sub>2</sub> ▷ q<sub>2</sub>, σ[r → s]) ∈ R*, and;*

*2(b)' there is* θ<sub>2</sub> ▷ p<sub>2</sub> −ϕ<sub>1</sub>/s<sup>•</sup>→ θ′<sub>2</sub> ▷ q<sub>2</sub> ∈ N(Δ<sub>2</sub>)*, enabled by both* θ<sub>1</sub> *and* θ<sub>2</sub>*, such that* (θ′<sub>1</sub> ▷ q<sub>1</sub>, θ′<sub>2</sub> ▷ q<sub>2</sub>, σ[r → s]) ∈ R*.*

*An N-bisimulation* R *is a relation such that both* R *and* R<sup>−1</sup> *are N-simulations. We write* S<sub>1</sub> ≺<sup>N</sup> S<sub>2</sub> *(resp.* S<sub>1</sub> ∼<sup>N</sup> S<sub>2</sub>*) if there is an N-simulation (resp. bisimulation)* R *such that* (N(q<sub>01</sub>), N(q<sub>02</sub>), v<sub>01</sub> ⋈ v<sub>02</sub>) ∈ R*.*

The intuition behind this definition is as follows. Recall that, in a normalized SRA, transitions are defined over minterms, which cannot be further broken down and are mutually disjoint. Therefore two transitions can read the same values if and only if they have the same minterm guard. Thus condition (i) makes sure that matching transitions can read exactly the same set of values. Analogously, condition (ii) restricts how a fresh transition of N(S<sub>1</sub>) must be matched by one of N(S<sub>2</sub>): 2(a)' only considers transitions of N(S<sub>2</sub>) reading registers s ∈ R<sub>2</sub> such that ϕ<sub>1</sub> = (θ<sub>2</sub>)<sub>s</sub> because, by definition of normalized SRA, θ<sub>2</sub> ▷ p<sub>2</sub> has no such transition if this condition is not met. Condition 2(b)' amounts to requiring a fresh transition of N(S<sub>2</sub>) that is enabled by both θ<sub>1</sub> and θ<sub>2</sub> (see Lemma 1), i.e., one that can read a symbol that is fresh w.r.t. both N(S<sub>1</sub>) and N(S<sub>2</sub>).

N-simulation is sound and complete for standard simulation.

**Theorem 3.** S<sub>1</sub> ≺ S<sub>2</sub> *if and only if* S<sub>1</sub> ≺<sup>N</sup> S<sub>2</sub>*.*

As a consequence, we can decide similarity of SRAs via their normalized versions. N-simulation is a relation over a finite set, namely N(Q<sub>1</sub>) × N(Q<sub>2</sub>) × P(R<sub>1</sub> × R<sub>2</sub>); therefore N-similarity can always be decided in finite time. We can leverage this result to provide algorithms for checking language inclusion/equivalence for deterministic SRAs (recall that these problems are undecidable for non-deterministic ones).

**Theorem 4.** *Given two deterministic SRAs* S<sup>1</sup> *and* S2*, there are algorithms to decide L* (S1) ⊆ *L* (S2) *and L* (S1) = *L* (S2)*.*

*Proof.* By Proposition 1 and Theorem 3, we can decide *L*(S<sub>1</sub>) ⊆ *L*(S<sub>2</sub>) by checking S<sub>1</sub> ≺<sup>N</sup> S<sub>2</sub>. This can be done algorithmically by iteratively building a relation R on triples that is an N-simulation on N(S<sub>1</sub>) and N(S<sub>2</sub>). The algorithm initializes R with (N(q<sub>01</sub>), N(q<sub>02</sub>), v<sub>01</sub> ⋈ v<sub>02</sub>), as this triple is required to be in R by Definition 6. Each iteration considers a candidate triple t and checks the conditions for N-simulation. If they are satisfied, it adds t to R, computes the next set of candidate triples (those which are required to belong to the simulation relation), and adds them to the list of triples still to be processed. If not, the algorithm returns *L*(S<sub>1</sub>) ⊈ *L*(S<sub>2</sub>). The algorithm terminates returning *L*(S<sub>1</sub>) ⊆ *L*(S<sub>2</sub>) when no triples are left to process. Determinism of S<sub>1</sub> and S<sub>2</sub>, and hence of N(S<sub>1</sub>) and N(S<sub>2</sub>) (by Proposition 6), ensures that computing candidate triples is deterministic. To decide *L*(S<sub>1</sub>) = *L*(S<sub>2</sub>), at each iteration we need to check that both t and t<sup>−1</sup> satisfy the conditions for N-simulation.
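As an illustration only, the worklist structure of this procedure can be sketched for the degenerate register-free case (plain deterministic automata, so the σ component of each triple disappears); the full algorithm additionally tracks σ and checks the conditions of Definition 6. The automata below are hypothetical examples, not benchmarks from the paper:

```python
def n_simulates(init1, init2, delta1, delta2, final1, final2):
    """Worklist skeleton of the inclusion check, specialized to the
    register-free case: triples (p1, p2, sigma) degenerate to pairs
    (p1, p2). `delta` maps (state, label) -> state; determinism makes
    the successor pair for each label unique."""
    relation, todo = set(), [(init1, init2)]
    while todo:
        p1, p2 = todo.pop()
        if (p1, p2) in relation:
            continue                          # already processed
        if p1 in final1 and p2 not in final2:
            return False                      # acceptance not matched
        for (q, a), q1 in delta1.items():
            if q != p1:
                continue
            if (p2, a) not in delta2:
                return False                  # p2 cannot match the move
            todo.append((q1, delta2[(p2, a)]))  # required candidate pair
        relation.add((p1, p2))
    return True                               # L(S1) ⊆ L(S2)

# a*b ⊆ (a|b)*: two hypothetical automata over labels 'a' and 'b'.
d1 = {(0, 'a'): 0, (0, 'b'): 1}
d2 = {(0, 'a'): 0, (0, 'b'): 0}
print(n_simulates(0, 0, d1, d2, final1={1}, final2={0}))  # -> True
```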

If S<sub>1</sub> and S<sub>2</sub> have, respectively, n<sub>1</sub>, n<sub>2</sub> states, m<sub>1</sub>, m<sub>2</sub> transitions, and r<sub>1</sub>, r<sub>2</sub> registers, the normalized versions have O(n<sub>1</sub>r<sub>1</sub>2<sup>m<sub>1</sub></sup>) and O(n<sub>2</sub>r<sub>2</sub>2<sup>m<sub>2</sub></sup>) states. Each triple, taken from the finite set N(Q<sub>1</sub>) × N(Q<sub>2</sub>) × P(R<sub>1</sub> × R<sub>2</sub>), is processed exactly once, so the algorithm iterates O(n<sub>1</sub>n<sub>2</sub>r<sub>1</sub>r<sub>2</sub>2<sup>m<sub>1</sub>+m<sub>2</sub>+r<sub>1</sub>r<sub>2</sub></sup>) times.

#### **5 Evaluation**

We have implemented SRAs in the open-source Java library SVPALib [26]. In our implementation, constructions are computed lazily when possible (e.g., the normalized SRA for emptiness and (bi)similarity checks). All experiments were performed on a machine with 3.5 GHz Intel Core i7 CPU with 16 GB of RAM (JVM 8 GB), with a timeout value of 300 s. The goal of our evaluation is to answer the following research questions:


**Benchmarks.** We focus on regular expressions with back-references, therefore all our benchmarks operate over the Boolean algebra of Unicode characters with intervals, i.e., the set of characters is the set of all 2<sup>16</sup> UTF-16 characters and the predicates are unions of intervals (e.g., [a-zA-Z]).<sup>2</sup> Our benchmark set contains 19 SRAs that represent variants of regular expressions with back-references obtained from the regular-expression crowd-sourcing website RegExLib [23]. The expressions check whether inputs have, for example, matching first/last name initials or both (Name-F, Name-L and Name), correct product codes/lot numbers of total length n (Pr-Cn, Pr-CLn), matching XML tags (XML), and IP addresses that match for n positions (IPn). We also create variants of the product benchmark presented in Sect. 2 where we vary the number of characters in the code and lot number. All the SRAs are deterministic.

#### **5.1 Succinctness of SRAs vs SFAs**

In this experiment, we relate the size of SRAs over finite alphabets to the size of the smallest equivalent SFAs. For each SRA, we construct the equivalent SFA by equipping the state space with the values stored in the registers at each step (this construction effectively builds the configuration LTS). Figure 2a shows the results. As expected, SFAs tend to blow up in size when the SRA contains multiple registers or complex register values. In cases where the register values range over small sets (e.g., [0-9]) it is often feasible to build an SFA equivalent to the SRA, but the construction always yields very large automata. In cases where the registers can assume many values (e.g., 2<sup>16</sup>) SFAs become prohibitively large and do not fit in memory. To answer **Q1**, even for finite alphabets, **it is not feasible to compile SRAs to SFAs**. Hence, SRAs are a succinct model.

#### **5.2 Performance of Membership Checking**

In this experiment, we measure the performance of SRA membership, and we compare it with the performance of the java.util.regex matching algorithm.

<sup>2</sup> Our experiments are over finite alphabets, but the Boolean algebra can be infinite by taking the alphabet to be positive integers and allowing intervals to contain ∞ as upper bound. This modification does not affect the running time of our procedures, therefore we do not report it.


**Fig. 2.** Experimental results. (a) Size of SRAs vs SFAs; (—) denotes that the SFA did not fit in memory; *|*reg*|* denotes how many different characters a register stored. (b) Performance of decision procedures; in the table, *L<sub>i</sub>* = *L*(S<sub>i</sub>) for i = 1, 2. (c) SRA membership and Java regex matching performance against input length; missing data points for Java are stack overflows.

For each benchmark, we generate inputs of length varying between approximately 100 and 10<sup>8</sup> characters and measure the time taken to check membership. Figure 2c shows the results. The performance of SRA (resp. Java) membership is not particularly affected by the size of the expression, hence the lines for different expressions mostly overlap. As expected, for SRAs the time taken to check membership grows linearly in the size of the input (axes are log scale). Remarkably, even though our implementation does not employ particular input-processing optimizations, it can still check membership for strings with tens of millions of characters in less than 10 s. We have found that our implementation is more efficient than the Java regex library, matching the same input on average 50 times faster than java.util.regex.Matcher. java.util.regex.Matcher seems to use a recursive algorithm to match back-references, which means it does not scale well: even when given the maximum stack size, the JVM reports a stack overflow for inputs as small as 20,000 characters. Our implementation can match such strings in less than 2 s. To answer **Q2**, **deterministic SRAs can be efficiently executed on large inputs and perform better than the** java.util.regex **matching algorithm**.

#### **5.3 Performance of Decision Procedures**

In this experiment, we measure the performance of the SRA simulation and bisimulation algorithms. Since all our SRAs are deterministic, these two checks correspond to language inclusion and equivalence. We select pairs of benchmarks for which these tests are meaningful (e.g., variants of the problem discussed at the end of Sect. 2). The results are shown in Fig. 2b. As expected, due to the translation to single-valued SRAs, our decision procedures do not scale well in the number of registers. This is already the case for classic register automata and is not a surprising result. However, our technique can still check equivalence and inclusion for regular expressions that no existing tool can handle. To answer **Q3**, **bisimulation and simulation algorithms for SRAs only scale to small numbers of registers**.

#### **6 Conclusions**

In this paper we have presented *Symbolic Register Automata*, a novel class of automata that can handle complex alphabet theories while allowing symbol comparisons for equality. SRAs encompass, and are strictly more expressive than, both register and symbolic automata. We have shown that they enjoy the same closure and decidability properties as the former, despite the presence of arbitrary guards on transitions, which are not allowed by RAs. Via a comprehensive set of experiments, we have concluded that SRAs are vastly more succinct than SFAs and that membership is efficient on large inputs. Decision procedures do not scale well in the number of registers, which is already the case for basic RAs.

**Related Work.** RAs were first introduced in [17]. There is an extensive literature on register automata, their formal languages and decidability properties [7,13,21,22,25], including variants with *global freshness* [20,27] and totally ordered data [4,14]. SRAs are based on the original model of [17], but are much more expressive, due to the presence of guards from an arbitrary decidable theory.

In recent work, variants over richer theories have appeared. In [9], RAs over rationals were introduced. They allow for a restricted form of linear arithmetic among registers (RAs with arbitrary linear arithmetic subsume two-counter automata, hence are undecidable). SRAs do not allow for operations on registers, but encompass a wider range of theories without any loss in decidability. Moreover, [9] does not study Boolean closure properties. In [8,16], RAs allowing guards over a range of theories – including (in)equality, total orders and increments/sums – are studied. Their focus is different from ours, as they are primarily interested in *active learning* techniques, and several restrictions are placed on models for the purpose of the learning process. We can also relate SRAs to *Quantified Event Automata* [2], which allow for guards and assignments to registers on transitions. However, in QEAs guards can be arbitrary, which leads to several problems, e.g., undecidable equivalence.

Symbolic automata were first introduced in [28], and many variants of them have been proposed [12]. The one closest to SRAs is Symbolic Extended Finite Automata (SEFAs) [11]. SEFAs are SFAs in which transitions can read more than one character at a time. A transition of arity k reads k symbols, which are consumed if they satisfy the predicate ϕ(x<sub>1</sub>, ..., x<sub>k</sub>). SEFAs allow arbitrary k-ary predicates over the input theory, which results in most problems being undecidable (e.g., equivalence and intersection emptiness) and in the model not being closed under Boolean operations. Even when deterministic, SEFAs are not closed under union and intersection. In terms of expressiveness, SRAs and SEFAs are incomparable: SRAs can only use equality, but can compare symbols at arbitrary points in the input, while SEFAs can only compare symbols within a constant window, but using arbitrary predicates.

Several works study matching techniques for extended regular expressions [3,5,18,24]. These works introduce automata models with ad-hoc features for extended regular constructs – including back-references – but focus on efficient matching, without studying closure and decidability properties. It is also worth noting that SRAs are not limited to alphanumeric or finite alphabets. On the negative side, SRAs cannot express capturing groups of an unbounded length, due to the finitely many registers. This limitation is essential for decidability.

**Future Work.** In [21] a polynomial algorithm for checking language equivalence of deterministic RAs is presented. This crucially relies on closure properties of symbolic bisimilarity, some of which are lost for SRAs. We plan to investigate whether this algorithm can be adapted to our setting. Extending SRAs with more complex comparison operators other than equality (e.g., a total order <) is an interesting research question, but most extensions of the model quickly lead to undecidability. We also plan to study active automata learning for SRAs, building on techniques for SFAs [1], RAs [6,8,16] and nominal automata [19].

#### **References**



The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Abstraction Refinement Algorithms for Timed Automata**

Victor Roussanaly, Ocan Sankur, and Nicolas Markey(B)

Univ Rennes, Inria, CNRS, IRISA, Rennes, France nmarkey@irisa.fr

**Abstract.** We present abstraction-refinement algorithms for model checking safety properties of timed automata. The abstraction domain we consider abstracts away zones by restricting the set of clock constraints that can be used to define them, while the refinement procedure computes the set of constraints that must be taken into consideration in the abstraction so as to exclude a given spurious counterexample. We implement this idea in two ways: an enumerative algorithm where a lazy abstraction approach is adopted, meaning that possibly different abstract domains are assigned to each exploration node; and a symbolic algorithm where the abstract transition system is encoded with Boolean formulas.

#### **1 Introduction**

Model checking [4,10,12,26] is an automated technique for verifying that the set of behaviors of a computer system satisfies a given property. Model-checking algorithms explore finite-state automata (representing the system under study) in order to decide if the property holds; if not, the algorithm returns an explanation. These algorithms have been extended to verify real-time systems modelled as timed automata [2,3], an extension of finite automata with clock variables to measure and constrain the amount of time elapsed between occurrences of transitions. The state-space exploration can be done by representing clock constraints efficiently using convex polyhedra called *zones* [8,9]. Algorithms based on this data structure have been implemented in several tools such as Uppaal [7], and have been applied in various industrial cases.

The well-known issue in the applications of model checking is the *state-space explosion* problem: the size of the state space grows exponentially in the size of the description of the system. There are several sources for this explosion: the system might be made of the composition of several subsystems (such as a distributed system), it might contain several discrete variables (such as in a piece of software), or it might contain a number of real-valued clocks as in our case.

This work was funded by ANR project Ticktac (ANR-18-CE40-0015) and by ERC grant EQualIS (StG-308087).

Numerous attempts have been made to circumvent this problem. Abstraction is a generic approach that consists in simplifying the model under study, so as to make it easier to verify [13]. *Existential* abstraction may only add extra behaviors, so that when a safety property holds in an abstracted model, it also holds in the original model; if on the other hand a safety property fails to hold, the model-checking algorithms return a witness trace exhibiting the non-safe behaviour: this either invalidates the property on the original model, if the trace exists in that model, or gives information about how to automatically refine the abstraction. This approach, named CEGAR (counter-example guided abstraction refinement) [11], was further developed and used, for instance, in software verification (BLAST [20], SLAM [5], ...).

The CEGAR approach has been adapted to timed automata, e.g., in [14,18], but the abstractions considered there only consist in removing clocks and discrete variables, and adding them back during refinement. So for most well-designed models, one ends up adding all clocks and variables, which renders the method useless. Two notable exceptions are [22], in which the zone extrapolation operators are dynamically adapted during the exploration, and [29], in which zones are refined when needed using interpolants. Both approaches define "exact" abstractions, in the sense that they make sure that all traces discovered in the abstract model are feasible in the concrete model at all times.

In this work, we consider a more general setting and study *predicate abstractions* on clock variables. Just like in software model checking, we define abstract state spaces using these predicates, where the values of the clocks and their relations are approximately represented by these predicates. New predicates are generated if needed during the refinement step. We instantiate our approach by two algorithms. The first one is a zone-based enumerative algorithm inspired by the *lazy abstraction* in software model checking [19], where we assign a possibly different abstract domain to each node in the exploration. The second algorithm is based on binary decision diagrams (BDD): by exploiting the observation that a small number of predicates was often sufficient to prove safety properties, we use an efficient BDD encoding of zones similar to one introduced in early work [28].

Let us explain the abstract domains we consider. Assume there are two clock variables x and y. The abstraction we consider consists in restricting the clock

(a) Abstraction of zone 1 ≤ *x, y* ≤ 2 (b) Abstraction of zone *y* ≤ 1 ∧ 1 ≤ *x*−*y* ≤ 2

**Fig. 1.** The abstract domain is defined by the clock constraints shown in thick red lines. In each example, the abstraction of the zone shown on the left (shaded area) is the larger zone on the right. (Color figure online)

constraints that can be used when defining zones. Assume that $x$ may only be compared with 2 or 3, $y$ only with 2, and $x - y$ only with $-1$ or 2. Then any conjunction of constraints one might obtain in this manner will be delimited by the thick red lines in Fig. 1; one cannot define a finer region under this restriction. The figure shows the abstraction process: given a "concrete" zone, its abstraction is the smallest zone which is a superset and is definable under our restriction. For instance, the abstraction of $1 \leq x, y \leq 2$ is $0 \leq x, y \leq 2 \wedge {-1} \leq x - y$ (cf. Fig. 1a).

*Related Works.* We give more detail on zone abstractions in timed automata. Most efforts in the literature have concentrated on designing zone abstraction operators that are exact, in the sense that they preserve the reachability relation between the locations of a timed automaton; see [6]. The idea is to determine bounds on the constants to which a given clock can be compared in a given part of the automaton, since the clock values do not matter outside these bounds. In [21, 22], the authors give an algorithm in which these bounds are dynamically adapted during the exploration, which allows one to obtain coarser abstractions. In [29], the exploration tree contains pairs of zones: a concrete zone as in the usual algorithm, and a coarser abstract zone. The algorithm explores all branches using the coarser zone, and immediately refines the abstract zone whenever an edge that is disabled in the concrete zone becomes enabled. In [17], a CEGAR loop was used to solve timed games by analyzing strategies computed for each abstract game; the abstraction consisted in collapsing locations.

Some works have adapted the abstraction-refinement paradigm to timed automata. In [14], the authors apply "localization reduction" to timed automata within an abstraction-refinement loop: they abstract away clocks and discrete variables, and only introduce them as they are needed to rule out spurious counterexamples. A more general but similar approach was developed in [18]. In [31], the authors adapt the trace abstraction refinement idea to timed automata where a finite automaton is maintained to rule out infeasible edge sequences.

The CEGAR approach was also used recently in the LinAIG framework for verifying linear hybrid automata [1]. In this work, the backward reachability algorithm exploits *don't-cares* to reduce the size of the Boolean circuits representing the state space. The abstractions consist in enlarging the size of *don't-cares* to reduce the number of linear predicates used in the representation.

#### **2 Timed Automata and Zones**

#### **2.1 Timed Automata**

Given a finite set of clocks $\mathcal{C}$, we call *valuations* the elements of $\mathbb{R}_{\geq 0}^{\mathcal{C}}$. For a clock valuation $v$, a subset $R \subseteq \mathcal{C}$, and a non-negative real $d$, we denote by $v[R \leftarrow d]$ the valuation $w$ such that $w(x) = v(x)$ for $x \in \mathcal{C} \setminus R$ and $w(x) = d$ for $x \in R$, and by $v + d$ the valuation $w$ such that $w(x) = v(x) + d$ for all $x \in \mathcal{C}$. We extend these operations to sets of valuations in the obvious way. We write $\mathbf{0}$ for the valuation that assigns 0 to every clock. An *atomic guard* is a formula of the form $x \prec k$ or $x - y \prec k$ with $x, y \in \mathcal{C}$, $k \in \mathbb{N}$, and ${\prec} \in \{<, \leq, >, \geq\}$. A *guard* is a conjunction of atomic guards. A valuation $v$ satisfies a guard $g$, written $v \models g$, if all atomic guards hold true when each $x \in \mathcal{C}$ is replaced with $v(x)$. Let $[\![g]\!] = \{v \in \mathbb{R}_{\geq 0}^{\mathcal{C}} \mid v \models g\}$ denote the set of valuations satisfying $g$. We write $\Phi_{\mathcal{C}}$ for the set of guards built on $\mathcal{C}$.

A *timed automaton* $\mathcal{A}$ is a tuple $(L, \mathit{Inv}, \ell_0, \mathcal{C}, E)$, where $L$ is a finite set of locations, $\mathit{Inv}\colon L \to \Phi_{\mathcal{C}}$ defines location invariants, $\mathcal{C}$ is a finite set of clocks, $E \subseteq L \times \Phi_{\mathcal{C}} \times 2^{\mathcal{C}} \times L$ is a set of edges, and $\ell_0 \in L$ is the initial location. An edge $e = (\ell, g, R, \ell')$ is also written as $\ell \xrightarrow{g,R} \ell'$. For any location $\ell$, we let $E(\ell)$ denote the set of edges leaving $\ell$.

A *configuration* of $\mathcal{A}$ is a pair $q = (\ell, v) \in L \times \mathbb{R}_{\geq 0}^{\mathcal{C}}$ such that $v \models \mathit{Inv}(\ell)$. A *run* of $\mathcal{A}$ is a sequence $q_1 e_1 q_2 e_2 \ldots q_n$ where for all $i \geq 1$, $q_i = (\ell_i, v_i)$ is a configuration, and either $e_i \in \mathbb{R}_{>0}$, in which case $q_{i+1} = (\ell_i, v_i + e_i)$, or $e_i = (\ell_i, g_i, R_i, \ell_{i+1}) \in E$, in which case $v_i \models g_i$ and $q_{i+1} = (\ell_{i+1}, v_i[R_i \leftarrow 0])$. A *path* is a sequence of edges with matching endpoint locations.

#### **2.2 Zones and DBMs**

Several tools for timed automata implement algorithms based on *zones*, which are particular polyhedra definable with clock constraints. Formally, a zone $Z$ is a subset of $\mathbb{R}_{\geq 0}^{\mathcal{C}}$ definable by a guard in $\Phi_{\mathcal{C}}$.

We recall a few basic operations defined on zones. First, the intersection $Z \cap Z'$ of two zones $Z$ and $Z'$ is clearly a zone. Given a zone $Z$, the set of time-successors of $Z$, defined as $Z^{\uparrow} = \{v + t \in \mathbb{R}_{\geq 0}^{\mathcal{C}} \mid t \in \mathbb{R}_{\geq 0}, v \in Z\}$, is easily seen to be a zone; similarly for time-predecessors $Z^{\downarrow} = \{v \in \mathbb{R}_{\geq 0}^{\mathcal{C}} \mid \exists t \geq 0.\ v + t \in Z\}$. Given $R \subseteq \mathcal{C}$, we let $\mathrm{Reset}_R(Z)$ be the zone $\{v[R \leftarrow 0] \in \mathbb{R}_{\geq 0}^{\mathcal{C}} \mid v \in Z\}$, and $\mathrm{Free}_x(Z) = \{v' \in \mathbb{R}_{\geq 0}^{\mathcal{C}} \mid \exists v \in Z, d \in \mathbb{R}_{\geq 0}.\ v' = v[x \leftarrow d]\}$.

Zones can be represented as *difference-bound matrices (DBMs)* [8, 15]. Let $\mathcal{C}_0 = \mathcal{C} \cup \{0\}$, where $0$ is an extra symbol representing a special clock whose value is always 0. A DBM is a $|\mathcal{C}_0| \times |\mathcal{C}_0|$ matrix taking values in $(\mathbb{Z} \times \{<, \leq\}) \cup \{(+\infty, <)\}$. Intuitively, cell $(x, y)$ of a DBM $M$ stores a pair $(d, \prec)$ representing an upper bound on the difference $x - y$. For any DBM $M$, we let $[\![M]\!]$ denote the zone it defines.

While several DBMs can represent the same zone, each zone admits a *canonical* representation, obtained by storing the tightest clock constraints defining the zone. This canonical representation can be computed via shortest paths in a graph whose vertices are the clocks and whose edges are weighted by the clock constraints, with the natural addition and comparison on $(\mathbb{Z} \times \{<, \leq\}) \cup \{(+\infty, <)\}$. This graph has a negative cycle if, and only if, the associated DBM represents the empty zone.
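The shortest-path canonicalization described above can be sketched in a few lines of Python. This is a minimal sketch under our own conventions, not the paper's implementation: DBMs are stored as dictionaries indexed by clock-name pairs, and the names `canonicalize`, `badd`, and the `(bound, strictness)` encoding are ours.

```python
from math import inf

# Strictness encoding: '<' is tighter than '<=', so it must sort first.
LT, LE = 0, 1

def badd(a, b):
    """Add two DBM bounds; the sum is strict if either summand is."""
    return (a[0] + b[0], min(a[1], b[1]))

def canonicalize(m):
    """Floyd-Warshall tightening of a DBM given as {(x, y): (k, s)}.
    '0' plays the role of the special zero clock. Returns the canonical
    DBM, or None if the zone is empty (a negative cycle exists)."""
    clocks = sorted({x for x, _ in m})
    for k in clocks:
        for i in clocks:
            for j in clocks:
                via = badd(m[i, k], m[k, j])
                if via < m[i, j]:
                    m[i, j] = via
    # the zone is empty iff some diagonal entry drops below (0, <=)
    if any(m[x, x] < (0, LE) for x in clocks):
        return None
    return m
```

On the zone $1 \leq x, y \leq 2$ of Fig. 1a, for instance, this tightens the $(x, y)$ entry from $(+\infty, <)$ to $(1, \leq)$, the tightest bound on $x - y$.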

All operations on zones can be performed efficiently (in $O(|\mathcal{C}_0|^3)$) on the associated DBMs while maintaining the reduced form. For instance, the intersection $N = Z \cap Z'$ of two canonical DBMs $Z$ and $Z'$ can be obtained by first computing the DBM $M = \min(Z, Z')$ such that $M(x, y) = \min\{Z(x, y), Z'(x, y)\}$ for all $(x, y) \in \mathcal{C}_0^2$, and then turning $M$ into canonical form. We refer to [8] for full details. By a slight abuse of notation, we use the same notations for DBMs as for zones, writing e.g. $M' = M^{\uparrow}$, where $M$ and $M'$ are reduced DBMs such that $[\![M']\!] = [\![M]\!]^{\uparrow}$. Given an edge $e = (\ell, g, R, \ell')$ and a zone $Z$, we define $\mathrm{Post}_e(Z) = \mathit{Inv}(\ell') \cap (g \cap \mathrm{Reset}_R(Z))^{\uparrow}$ and $\mathrm{Pre}_e(Z) = (g \cap \mathrm{Free}_R(\mathit{Inv}(\ell') \cap Z))^{\downarrow}$. For a path $\rho = e_1 e_2 \ldots e_n$, we define $\mathrm{Post}_\rho$ and $\mathrm{Pre}_\rho$ by iteratively applying $\mathrm{Post}_{e_i}$ and $\mathrm{Pre}_{e_i}$, respectively.

#### **2.3 Clock-Predicate Abstraction and Interpolation**

For all clocks $x$ and $y$ in $\mathcal{C}_0$, we consider a finite set $\mathcal{D}_{x,y} \subseteq \mathbb{N} \times \{\leq, <\}$, and gather these in a table $\mathcal{D} = (\mathcal{D}_{x,y})_{x,y \in \mathcal{C}_0}$. $\mathcal{D}$ is the *abstract domain*, which restricts zones to be defined only using constraints of the form $x - y \prec k$ with $(k, \prec) \in \mathcal{D}_{x,y}$, as seen earlier. We call $\mathcal{D}$ the *concrete domain* if $\mathcal{D}_{x,y} = \mathbb{N} \times \{\leq, <\}$ for all $x, y \in \mathcal{C}_0$. A zone $Z$ is $\mathcal{D}$-definable if there exists a DBM $D$ such that $Z = [\![D]\!]$ and $D(x, y) \in \mathcal{D}_{x,y}$ for all $x, y \in \mathcal{C}_0$. Note that we do not require this witness DBM $D$ to be reduced; the reduction of such a DBM might introduce additional values. We say that domain $\mathcal{D}'$ is a *refinement* of $\mathcal{D}$ if for all $x, y \in \mathcal{C}_0$, we have $\mathcal{D}_{x,y} \subseteq \mathcal{D}'_{x,y}$.

An abstract domain $\mathcal{D}$ induces an *abstraction function* $\alpha_{\mathcal{D}}\colon 2^{\mathbb{R}_{\geq 0}^{\mathcal{C}}} \to 2^{\mathbb{R}_{\geq 0}^{\mathcal{C}}}$, where $\alpha_{\mathcal{D}}(Z)$ is the smallest $\mathcal{D}$-definable zone containing $Z$. For any reduced DBM $D$, $\alpha_{\mathcal{D}}([\![D]\!])$ can be computed by setting $D'(x, y) = \min\{(k, \prec) \in \mathcal{D}_{x,y} \mid D(x, y) \leq (k, \prec)\}$ (with $\min \emptyset = (\infty, <)$).
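This entry-wise rounding can be sketched in Python. The dictionary representation and the function name `alpha` are ours, not the paper's; bounds are pairs ordered so that a strict bound is tighter than a non-strict one with the same constant:

```python
from math import inf

LT, LE = 0, 1   # '<' sorts before '<=': the strict bound is the tighter one

def alpha(dbm, domain):
    """Round each off-diagonal entry of a canonical DBM up to the smallest
    allowed bound of the abstract domain; entries with no allowed bound
    above them become (inf, LT), i.e. unconstrained."""
    out = {}
    for (x, y), b in dbm.items():
        if x == y:
            out[x, y] = b
            continue
        allowed = [c for c in domain.get((x, y), []) if b <= c]
        out[x, y] = min(allowed) if allowed else (inf, LT)
    return out
```

On the domain of Fig. 1 ($x$ compared with 2 or 3, $y$ with 2, $x - y$ with $-1$ or 2) and the zone $1 \leq x, y \leq 2$, this yields $0 \leq x, y \leq 2 \wedge {-1} \leq x - y$, as in Fig. 1a.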

An *interpolant* for a pair of zones $(Z_1, Z_2)$ with $Z_1 \cap Z_2 = \emptyset$ is a zone $Z_3$ with $Z_1 \subseteq Z_3$ and $Z_3 \cap Z_2 = \emptyset$¹ [29]. We use interpolants to refine our abstractions; in order not to add too many new constraints when refining, our aim is to find *minimal interpolants*: define the density of a DBM $D$ as $d(D) = \#\{(x, y) \in \mathcal{C}_0^2 \mid D(x, y) \neq (\infty, <)\}$. Notice that while any pair of disjoint convex polyhedra can be separated by hyperplanes, not all pairs of disjoint zones admit interpolants of density 1; this is because not all (half-spaces delimited by) hyperplanes are zones. Still, we can bound the density of a minimal interpolant:

**Lemma 1.** *For any pair of disjoint, non-empty zones* (A, B)*, there exists an interpolant of density less than or equal to* |C0|/2*.*

By adapting the algorithm of [29] for computing interpolants, we can compute minimal interpolants efficiently:

**Proposition 2.** *Computing a minimal interpolant can be performed in* <sup>O</sup>(|C|<sup>4</sup>)*.*

#### **3 Enumerative Algorithm**

The first type of algorithm we present is a zone-based enumerative algorithm based on the clock-predicate abstractions. Let us first describe the overall

<sup>1</sup> It is sometimes also required that the interpolant only involves clocks that have non-trivial constraints in both Z<sup>1</sup> and Z2. We do not impose this requirement in our definition, but it will hold true in the interpolants computed by our algorithm.

algorithm in Algorithm 1, which is a typical abstraction-refinement loop. We then explain how the abstract reachability and refinement procedures are instantiated.


The initialization at line 1 chooses an abstract domain for the initial state, which can be either empty (thus the coarsest abstraction) or defined according to some heuristic. The algorithm maintains the wait and passed lists used in the forward exploration. As usual, the wait list can be implemented as a stack, a queue, or another priority list that determines the search order. The algorithm also uses covering nodes: if there are two nodes $n$ and $n'$, with $n \in$ passed, $n' \in$ wait, $n'.\ell = n.\ell$, and $n'.Z \subseteq n.Z$, then we know that every location reachable from $n'$ is also reachable from $n$. Since we have already explored $n$ and generated its successors, there is no need to explore the successors of $n'$. The algorithm explicitly creates an exploration tree: line 2 creates a node containing location $\ell_0$, zone $\mathbf{0}^{\uparrow}$, and the abstract domain $\mathcal{D}_0$ as the root of our tree, and adds it to the wait list. More details on the tree are given in the next subsection. Procedure AbsReach then looks for a trace to the target location $\ell_T$. If such a trace exists, line 9 checks its feasibility. Here $\pi$ is a sequence of nodes and edges of $\mathcal{A}$. The feasibility check is done by computing predecessors with zones starting from the final state, without using the abstraction function. If the last zone intersects our initial zone, the trace is feasible. More details are given in Sect. 3.2.
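The shape of this abstraction-refinement loop can be sketched generically in Python, with the abstract search, the feasibility check, and the refinement passed in as functions (all names are ours, chosen for illustration):

```python
def cegar(abs_reach, feasible, refine):
    """Generic CEGAR loop in the shape of Algorithm 1: search the abstract
    system for a trace to the target, then confirm it or refine and retry."""
    while True:
        trace = abs_reach()        # None when the target is unreachable
        if trace is None:
            return "safe"
        if feasible(trace):
            return ("unsafe", trace)
        refine(trace)              # eliminate the spurious trace
```

In the paper's setting, each pass through `refine` strictly refines the abstraction, which is what ensures termination (Theorem 6).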

#### **3.1 Abstract Forward Reachability: AbsReach**

We give a generic algorithm, independent of the implementation of the abstraction functions and of the refinement procedure.

Algorithm 2 describes the reachability procedure under a given abstract domain $\mathcal{D}$. It is similar to the standard forward reachability algorithm using a wait-list and a passed-list. We explicitly create an exploration tree where the leaves are nodes in wait, covered nodes, or nodes that have no non-empty successors. Each node $n$ contains the fields $\ell$ and $Z$, which are labels describing the current location and zone; field covered points to a node covering the current node (it is undefined if the current node is not (known to be) covered); field parent points to the parent node in the tree (it is undefined for the root); and field $\mathcal{D}$ is the abstract domain associated with the node. Thus, the algorithm uses a possibly different abstract domain for each node in the exploration tree.

The difference of our algorithm w.r.t. standard reachability can be seen at lines 8 and 11. At line 8, we apply the abstraction function to the zone taken from the wait-list before adding it to the passed-list. The abstraction function $\alpha$ is a function of a zone $Z$ and a node $n$. This allows one to define variants with different dependencies; for instance, $\alpha$ might depend on the abstract domain $n.\mathcal{D}$ at the current node, but it can also use other information available in $n$ or on the path ending in $n$. For now, it is best to think of $\alpha$ simply as $Z \mapsto \alpha_{n.\mathcal{D}}(Z)$. At line 11, the function choose-dom chooses an abstract domain for the node $n'$. The domain could be global for all nodes, or local to each node. A good trade-off, which we used in our experiments, is to associate domains with the locations of the timed automaton.

*Remark 1.* Note that we apply the abstraction function when the node is inserted into the passed list. This is because we want the node to contain the smallest possible zone when we test whether the node is covered. We only need the abstracted zone when we compute its successors and when we test whether the node is covering. This allows us to store a single zone per node.

As a first step towards proving correctness of our algorithm, we show that the following property is preserved by Algorithm AbsReach:

For all nodes $n$ in passed, for all edges $e$ from $n.\ell$, if $\mathrm{Post}_e(n.Z) \neq \emptyset$, then $n$ has a child $n'$ such that $\mathrm{Post}_e(n.Z) \subseteq n'.Z$. If $n'$ is in passed, then we also have $\alpha_{n'.\mathcal{D}}(\mathrm{Post}_e(n.Z)) \subseteq n'.Z$. (1)

**Lemma 3.** *Algorithm* AbsReach *preserves Property* (1)*.*

Note that although we use inclusion in Property (1), AbsReach would actually preserve equality of zones; but we will not always have equality before running AbsReach, because Refine might change the zones of some nodes without updating the zones of all their descendants.

#### **3.2 Refinement: Refine**

We now describe our refinement procedure Refine. Assume that AbsReach returns $\pi = A_1 \xrightarrow{\sigma_1} A_2 \xrightarrow{\sigma_2} \ldots \xrightarrow{\sigma_{k-1}} A_k$, and write $\mathcal{D}_i$ for the domain associated with each $A_i$. We write $C_1$ for the initial concrete zone, and for $i < k$, we define $C_{i+1} = \mathrm{Post}_{\sigma_i}(A_i)$. We also set $Z_k = A_k$ and, for $i < k$, $Z_i = \mathrm{Pre}_{\sigma_i}(Z_{i+1}) \cap A_i$. Then $\pi$ is not feasible if, and only if, $\mathrm{Post}_{\sigma_1 \ldots \sigma_{k-1}}(C_1) = \emptyset$, or equivalently $\mathrm{Pre}_{\sigma_1 \ldots \sigma_{k-1}}(A_k) \cap C_1 = \emptyset$. Since $C_i \subseteq A_i$ for all $i \leq k$, we have that $\pi$ is not feasible if, and only if, $\exists i \leq k.\ C_i \cap Z_i = \emptyset$. We illustrate this in Fig. 2.

**Fig. 2.** Spurious counter-example: Z<sup>1</sup> ∩ C<sup>1</sup> = ∅

Let us assume that $\pi$ is not feasible, and denote by $i_0$ the maximal index such that $C_{i_0} \cap Z_{i_0} = \emptyset$. This index also has the property that $Z_j = \emptyset$ for all $j < i_0$, while $Z_{i_0} \neq \emptyset$. Once we have identified the trace as spurious by computing the $Z_j$, we have two possibilities:


We can then update the values of $C_i$ for $i > i_0$ and repeat the process until we reach an index $j_0$ such that $C_{j_0} = \emptyset$. We have then modified the nodes $n_{i_0}, \ldots, n_{j_0}$, and knowing that $n_{j_0}.Z = \emptyset$, we can delete $n_{j_0}$ and all of its descendants. Since some of the descendants of $n_{i_0}$ have not been modified, this might cause refinements of the first type in the future. In order to ensure termination, we sometimes have to cut a subtree from a node in $n_{i_0}, \ldots, n_{j_0-1}$ and reinsert it in the wait list to restart the exploration from there. We call this action cut, and several heuristics can decide when to apply it. In the rest of this paper, we use the following heuristic: we perform cut on the first node of $n_{i_0}, \ldots, n_{j_0}$ that is covered by some other node. Since this node is covered, we know that we will not restart the exploration from it, unless it was covered by one of its own descendants. If none of these nodes is covered, we delete $n_{j_0}$ and its descendants. Other heuristics are possible, for instance applying cut on $n_{i_0}$; we found the above heuristic to be the most efficient in our experiments.

**Lemma 4.** *Pick a node* n*, and let* Y = n.Z*. Then after running* Refine*, either node* n *is deleted, or it holds* n.Z ⊆ Y *. In other words, the zone of a node can only be reduced by* Refine*.*

It follows that Refine also preserves Property (1), so that:

**Lemma 5.** *Algorithm 1 satisfies Property* (1)*.*

We can then prove that our algorithm correctly decides the reachability problem and always terminates.

**Theorem 6.** *Algorithm 1 terminates and is correct.*

#### **4 Symbolic Algorithm**

#### **4.1 Boolean Encoding of Zones**

We now present a symbolic algorithm that represents abstract states using Boolean formulas. Let $\mathbb{B} = \{0, 1\}$, and let $\mathcal{V}$ be a set of variables. A Boolean formula $f$ that uses variables from a set $X \subseteq \mathcal{V}$ is written $f(X)$ to make the dependency explicit; we sometimes write $f(X, Y)$ in place of $f(X \cup Y)$. Such a formula represents a set $[\![f]\!] = \{v \in \mathbb{B}^{\mathcal{V}} \mid v \models f\}$. We consider primed versions of all variables; this will allow us to write formulas relating two valuations. For any subset $X \subseteq \mathcal{V}$, we define $X' = \{p' \mid p \in X\}$.

A *literal* is either $p$ or $\neg p$ for a variable $p$. Given a set $X$ of variables, an $X$-*minterm* is a conjunction of literals in which each variable of $X$ appears exactly once. $X$-minterms can be seen as elements of $\mathbb{B}^X$. Given a vector of Boolean formulas $Y = (Y_x)_{x \in X}$, the formula $f[Y/X]$ is the *substitution of* $X$ *by* $Y$ *in* $f$, obtained by replacing each $x \in X$ with the formula $Y_x$. The positive cofactor of $f(X)$ by $x$ is $\exists x.\ (x \wedge f(X))$, and its negative cofactor is $\exists x.\ (\neg x \wedge f(X))$.

Let us define a generic operator post that computes the successors of a set $S(X, Y)$ given a relation $R(X, X')$ (here, $Y$ designates any set of variables on which $S$ might depend outside of $X$): $\mathrm{post}_R(S(X, Y)) = (\exists X.\ S(X, Y) \wedge R(X, X'))[X/X']$. Similarly, we set $\mathrm{pre}_R(S(X, Y)) = \exists X'.\ (S(X, Y)[X'/X] \wedge R(X, X'))$, which computes the predecessors of $S(X, Y)$ by the relation $R$ [24].
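With states enumerated explicitly, these operators are one-liners. The sketch below is purely illustrative, not the symbolic implementation: sets of (state, successor) pairs stand in for the BDD formulas of the text.

```python
def post(S, R):
    """Successors of S under relation R, a set of (state, successor) pairs."""
    return {t for (s, t) in R if s in S}

def pre(S, R):
    """Predecessors of S under R."""
    return {s for (s, t) in R if t in S}
```

In the BDD implementation, the same effect is obtained by conjoining with $R$, existentially quantifying the unprimed variables, and renaming primed variables to unprimed ones.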

*Clock Predicate Abstraction.* We fix a total order $<$ on $\mathcal{C}_0$. In this section, abstract domains are defined as $\mathcal{D} = (\mathcal{D}_{x,y})_{x < y \in \mathcal{C}_0}$, that is, only for pairs with $x < y$. Indeed, constraints of the form $x - y \leq k$ with $x > y$ are encoded using the negation of $y - x < -k$, since $(x - y \leq k) \Leftrightarrow \neg(y - x < -k)$. We thus define $\mathcal{D}_{x,y} = -\mathcal{D}_{y,x}$ for all $x > y$.

For $x, y \in \mathcal{C}_0$, let $\mathcal{P}^{\mathcal{D}}_{x,y}$ denote the set of *clock predicates associated to* $\mathcal{D}_{x,y}$:

$$\mathcal{P}^{\mathcal{D}}_{x,y} = \{ P_{x-y \prec k} \mid (k, \prec) \in \mathcal{D}_{x,y} \}.$$

Let $\mathcal{P}^{\mathcal{D}} = \bigcup_{x,y \in \mathcal{C}_0} \mathcal{P}^{\mathcal{D}}_{x,y}$ denote the set of all clock predicates associated with $\mathcal{D}$ (we may omit the superscript $\mathcal{D}$ when it is clear). For all $(x, y) \in \mathcal{C}_0^2$ and $(k, \prec) \in \mathcal{D}_{x,y}$, we denote by $p_{x-y \prec k}$ the literal $P_{x-y \prec k}$ if $x < y$, and $\neg P_{y-x \prec^{-1} -k}$ otherwise (where $\leq^{-1} = {<}$ and $<^{-1} = {\leq}$). We also consider a set $B$ of Boolean variables used to encode locations. Overall, the state space is described using Boolean formulas on these two types of variables, so states are elements of $\mathbb{B}^{\mathcal{P} \cup B}$.

Our Boolean encoding of clock constraints and the semantic operations follow those of [28] for a concrete domain. We define them, however, for abstract domains, and show how the successor computation and refinement operations can be performed.

Let us define the *clock semantics* of predicate $P_{x-y \prec k}$ as $[\![P_{x-y \prec k}]\!]_{\mathcal{C}_0} = \{\nu \in \mathbb{R}_{\geq 0}^{\mathcal{C}_0} \mid \nu(x) - \nu(y) \prec k\}$. Since the set $\mathcal{C}$ of clocks is fixed, we may omit the subscript and just write $[\![P_{x-y \prec k}]\!]$. We interpret conjunction, disjunction, and negation as intersection, union, and complement, respectively. Given a $\mathcal{P}$-minterm $v \in \mathbb{B}^{\mathcal{P}}$, we define $[\![v]\!]_{\mathcal{D}} = \bigcap_{p \text{ s.t. } v(p)} [\![p]\!]_{\mathcal{D}} \cap \bigcap_{p \text{ s.t. } \neg v(p)} [\![p]\!]^{c}_{\mathcal{D}}$; thus, the negation of a predicate encodes its complement. For a Boolean formula $F(\mathcal{P})$, we set $[\![F]\!] = \bigcup_{v \in \mathrm{Minterms}(F)} [\![v]\!]_{\mathcal{D}}$. Intuitively, the minterms of $\mathcal{P}$ define the smallest zones of $\mathbb{R}_{\geq 0}^{\mathcal{C}}$ definable using $\mathcal{P}$. A minterm $v \in \mathbb{B}^{B \cup \mathcal{P}}$ defines a pair $[\![v]\!]_{\mathcal{D}} = (\ell, Z)$, where $\ell$ is encoded by $v|_B$ and $Z = [\![v|_{\mathcal{P}}]\!]_{\mathcal{D}}$. A Boolean formula $F$ on $B \cup \mathcal{P}$ defines a set $[\![F]\!]_{\mathcal{D}} = \bigcup_{v \in \mathrm{Minterms}(F)} [\![v]\!]_{\mathcal{D}}$ of such pairs. A minterm $v$ is *satisfiable* if $[\![v]\!]_{\mathcal{D}} \neq \emptyset$.

An abstract domain $\mathcal{D}$ induces an *abstraction function* $\alpha_{\mathcal{D}}\colon 2^{\mathbb{R}_{\geq 0}^{\mathcal{C}}} \to 2^{\mathbb{B}^{\mathcal{P}}}$ with $\alpha_{\mathcal{D}}(Z) = \{v \in \mathbb{B}^{\mathcal{P}} \mid [\![v]\!]_{\mathcal{D}} \cap Z \neq \emptyset\}$, from the set of zones to the set of subsets of Boolean valuations on $\mathcal{P}$. The *concretization function* is $[\![\cdot]\!]_{\mathcal{D}}\colon 2^{\mathbb{B}^{\mathcal{P}}} \to 2^{\mathbb{R}_{\geq 0}^{\mathcal{C}}}$. The pair $(\alpha_{\mathcal{D}}, [\![\cdot]\!]_{\mathcal{D}})$ is a Galois connection, and $[\![\alpha_{\mathcal{D}}(Z)]\!]_{\mathcal{D}}$ is the most precise abstraction of $Z$ in the domain induced by $\mathcal{D}$. Notice that $\alpha_{\mathcal{D}}$ is non-convex in general: for instance, if the clock predicates are $x \leq 2$ and $y \leq 2$, then the set defined by the constraint $x = y$ maps to $(p_{x \leq 2} \wedge p_{y \leq 2}) \vee (\neg p_{x \leq 2} \wedge \neg p_{y \leq 2})$.
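The non-convexity example can be checked with a small computation: sampling points of the set $x = y$ and collecting the predicate minterms they realize. This is an illustrative sketch in which finite sampling stands in for the exact zone computation, and the name `alpha_minterms` is ours:

```python
def alpha_minterms(points, predicates):
    """Abstraction of a set of concrete points: the set of predicate
    truth vectors (minterms) realized by at least one point."""
    return {tuple(p(v) for p in predicates) for v in points}
```

The two minterms obtained on the diagonal correspond exactly to $(p_{x \leq 2} \wedge p_{y \leq 2}) \vee (\neg p_{x \leq 2} \wedge \neg p_{y \leq 2})$.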

#### **4.2 Reduction and Successor Computation**

We now define the reduction operation, which is similar to the reduction of DBMs. The idea is to eliminate unsatisfiable minterms from a given Boolean formula. For example, we would like to make sure that in all minterms, if $p_{x-y \leq 1}$ holds, then so does $p_{x-y \leq 2}$, when both predicates are available. Another issue is to eliminate minterms that are unsatisfiable due to the triangle inequality. This is similar to the shortest-path computation used to turn DBMs into canonical form.

*Example 1.* Given the predicates $\mathcal{P} = \{p_{x-y \leq 1}, p_{y-z \leq 1}, p_{x-z \leq 2}\}$, the formula $p_{x-y \leq 1} \wedge p_{y-z \leq 1}$ is not reduced, since it contains the unsatisfiable minterm $p_{x-y \leq 1} \wedge p_{y-z \leq 1} \wedge \neg p_{x-z \leq 2}$. However, the same formula is reduced if $\mathcal{P} = \{p_{x-y \leq 1}, p_{y-z \leq 1}\}$.
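A 2-reduced minterm can be recognized by checking the length-1 and length-2 implications directly. The naive sketch below handles non-strict difference predicates only; the representation and the name `is_2_reduced` are ours, whereas the paper's reduce² operates symbolically on BDDs:

```python
def is_2_reduced(m):
    """Check a minterm, given as {(x, y, k): bool} for predicates
    x - y <= k, against the length-1 and length-2 implications."""
    for (x, y, k), v in m.items():
        if not v:
            continue
        for (a, b, k2), v2 in m.items():
            # a tighter bound on the same difference implies the looser one
            if (a, b) == (x, y) and k <= k2 and not v2:
                return False
            # triangle: x-y <= k and y-b <= k2 imply x-b <= k+k2
            if v2 and a == y:
                for (c, d, k3), v3 in m.items():
                    if (c, d) == (x, b) and k + k2 <= k3 and not v3:
                        return False
    return True
```

On Example 1, the minterm containing $\neg p_{x-z \leq 2}$ is rejected, while the same truth assignment over the smaller predicate set is accepted.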

In this paper, we use a limited form of reduction, since reductions are the most expensive operations in our algorithms. The following formula defines 2-reduction, which intuitively amounts to applying shortest paths along paths of lengths 1 and 2:

$$\bigwedge_{\substack{(x,y)\in\mathcal{C}_0^2\\(k,\prec)\in\mathcal{D}_{x,y}}} \left[ p_{x-y\prec k} \leftarrow \left( \bigvee_{\substack{(l_1,\prec_1)\in\mathcal{D}_{x,y}\\(l_1,\prec_1)\leq(k,\prec)}} p_{x-y\prec_1 l_1} \;\vee \bigvee_{\substack{z\in\mathcal{C}_0,\ (l_1,\prec_1)\in\mathcal{D}_{x,z},\\(l_2,\prec_2)\in\mathcal{D}_{z,y},\\(l_1,\prec_1)+(l_2,\prec_2)\leq(k,\prec)}} p_{x-z\prec_1 l_1} \wedge p_{z-y\prec_2 l_2} \right) \right]$$

**Lemma 7.** *For all formulas* $S(\mathcal{P})$*, we have* $[\![S]\!]_{\mathcal{D}} = [\![\mathrm{reduce}^2_{\mathcal{D}}(S)]\!]_{\mathcal{D}}$*, and all minterms of* $\mathrm{reduce}^2_{\mathcal{D}}(S)$ *are* 2*-reduced.*

Since 2-reduction does not consider shortest paths of all lengths, there are, in general, 2-reduced unsatisfiable minterms. Nevertheless, any abstraction can be refined so that the updated 2-reduction eliminates a given unsatisfiable minterm:

**Lemma 8.** *Let* $v \in \mathbb{B}^{\mathcal{P}^{\mathcal{D}}}$ *be a minterm such that* $v \models \mathrm{reduce}^2_{\mathcal{D}}$ *and* $[\![v]\!]_{\mathcal{D}} = \emptyset$*. One can compute in polynomial time a refinement* $\mathcal{D}' \supset \mathcal{D}$ *such that* $v \not\models \mathrm{reduce}^2_{\mathcal{D}'}$*.*

We now explain how the successor computation is realized in our encoding. For a guard $g$, assume we have computed an abstraction $\alpha_{\mathcal{D}}(g)$ in the present abstract domain. For each transition $\sigma = (\ell_1, g, R, \ell_2)$, let us define the formula $T_\sigma = \mathrm{enc}(\ell_1) \wedge \alpha_{\mathcal{D}}(g)$. We show how each basic operation on zones can be computed in our BDD encoding. In our algorithm, all formulas $A(B, \mathcal{P})$ representing sets of states are assumed to be reduced, that is, $A(B, \mathcal{P}) \models \mathrm{reduce}^2_{\mathcal{D}}(A(B, \mathcal{P}))$.

The intersection operation is simply logical conjunction:

**Lemma 9.** *For all reduced formulas* A(P) *and* B(P)*, we have* A(P) ∧ B(P) = αD([[A(P)]]<sup>D</sup> ∩ [[B(P)]]D)*.*

For the time successors, we define $\mathrm{Up}(A(B, \mathcal{P})) = \mathrm{reduce}(\mathrm{post}_{S_{\mathrm{Up}}}(A(B, \mathcal{P})))$, where

$$S_{\mathrm{Up}} = \bigwedge_{\substack{x \in \mathcal{C} \\ (k, \prec) \in \mathcal{D}_{x,0}}} \left(\neg p_{x-0 \prec k} \to \neg p'_{x-0 \prec k}\right) \;\wedge \bigwedge_{\substack{x, y \in \mathcal{C}_0,\ x, y \neq 0 \\ (k, \prec) \in \mathcal{D}_{x,y}}} \left(p'_{x-y \prec k} \leftrightarrow p_{x-y \prec k}\right).$$

**Lemma 10.** *For any Boolean formula* A(B,P)*,* αD([[A]]↑) ⊆ Up(A)*. Moreover, if* D *is the concrete domain and* A *is reduced, then this holds with equality.*

Following similar ideas, we handle clock resets by defining $\mathrm{Reset}_z(A) = \mathrm{reduce}(\mathrm{post}_{S_{\mathrm{Reset}_z}}(A))$, for a (complex) relation $S_{\mathrm{Reset}_z}$ encoding how the predicates evolve (see the long version [27] of this article for more detailed explanations). We get:

**Lemma 11.** *For any Boolean formula* A(B,P)*, and any clock* z ∈ C*, we have* <sup>α</sup>D(Resetz([[A]]D)) <sup>⊆</sup> Resetz(A)*. Moreover, if* <sup>D</sup> *is the concrete domain, and* <sup>A</sup> *is reduced, then the above holds with equality.*

**Algorithm 3.** Algorithm SymReach that checks the reachability of a target location $\ell_T$ in a given abstract domain $\mathcal{D}$.

```
Input: A = (L, Inv, ℓ0, C, E), ℓT, D
 1 next := enc(ℓ0) ∧ αD(∧x∈C x = 0);
 2 layers := [];
 3 reachable := false;
 4 while (¬reachable ∧ next) ≠ false do
 5     reachable := reachable ∨ next;
 6     next := ApplyEdges(Up(next)) ∧ ¬reachable;
 7     layers.push(next);
 8     if (next ∧ enc(ℓT)) ≠ false then
 9         return ExtractTrace(layers);
10 return Not reachable;
```
#### **4.3 Model-Checking Algorithm**

Algorithm 3 shows how to check the reachability of a target location given an abstract domain. The list layers contains, at position $i$, the set of states that are reachable in $i$ steps. The function ApplyEdges computes the disjunction of the immediate successors by all edges. It consists in looping over all edges $e = (\ell_1, g, R, \ell_2)$ and gathering the following image by $e$:

$$\mathrm{enc}(\ell_2) \wedge \mathrm{Reset}_{r_k}(\mathrm{Reset}_{r_{k-1}}(\ldots \mathrm{Reset}_{r_1}((\exists B.\ A(B, \mathcal{P}) \wedge \mathrm{enc}(\ell_1)) \wedge \alpha_{\mathcal{D}}(g)))),$$

where $R = \{r_1, \ldots, r_k\}$. We thus use a partitioned transition relation and do not compute the monolithic transition relation.

When the target location is found to be reachable, ExtractTrace(layers) returns a trace reaching the target location. This is standard and can be done by computing backwards from the last element of layers, finding which edge can be applied to reach the current state. Since both the reset and time-successor operations are defined using relations, predecessors in our abstract system can easily be computed using the operator $\mathrm{pre}_R$. As is standard, we omit the precise definition of this function (the reader can refer to the implementation), but assume that it returns a trace of the form $A_1 \xrightarrow{\sigma_1} A_2 \xrightarrow{\sigma_2} \ldots \xrightarrow{\sigma_{n-1}} A_n$, where the $A_i(B, \mathcal{P})$ are minterms and the $\sigma_i$ belong to the trace alphabet $\Sigma = \{\mathrm{up}, r_\emptyset\} \cup \{r(x)\}_{x \in \mathcal{C}}$, with the following meaning:

– if Ai −up→ Ai+1, then Ai+1 = Up(Ai);
– if Ai −r∅→ Ai+1, then Ai+1 = Ai;
– if Ai −r(x)→ Ai+1, then Ai+1 = Resetx(Ai).

The feasibility of such a trace is easily checked using DBMs.

The overall algorithm then follows a classical CEGAR scheme. We initialize D by adding the clock constraints that appear syntactically in A, which is often a good heuristic. We run the reachability check of Algorithm 3. If no trace is found, then the target location is not reachable. If a trace is found, we check it for feasibility. If it is feasible, the counterexample is confirmed; otherwise the trace is spurious, and we run the refinement procedure described in the next subsection and repeat the analysis.
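This outer loop can be summarized as follows. In this minimal Python sketch, `check_reach`, `is_feasible`, and `refine` are placeholders for Algorithm 3, the DBM feasibility check, and the refinement procedure of Sect. 4.4; none of these names come from the paper.

```python
def cegar(domain, check_reach, is_feasible, refine):
    """Classical CEGAR driver: returns ('unreachable', None), or
    ('reachable', trace) for a confirmed counterexample."""
    while True:
        trace = check_reach(domain)       # Algorithm 3 on abstraction αD
        if trace is None:
            return ('unreachable', None)  # no abstract trace: safe
        if is_feasible(trace):
            return ('reachable', trace)   # concrete counterexample confirmed
        domain = refine(domain, trace)    # eliminate the spurious trace
```

Termination of the loop relies on each refinement strictly eliminating the spurious trace that produced it, as established by Lemmas 12 and 13.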

#### **4.4 Abstraction Refinement**

Since we initialize D with all clock constraints appearing in guards, we can assume that all guards are represented exactly in the considered abstractions. The algorithm can easily be extended to the general case; this assumption merely simplifies the presentation.

The abstract transition relation we use is not the most precise abstraction of the concrete transition relation. It is therefore possible to have an abstract transition A1 −a→ A2 for some action a while no concrete transition exists between [[A1]] and [[A2]]. This requires care and is not a direct application of the standard refinement technique from [11]. A second difficulty is due to the incomplete reduction of the predicates using reduceD: some reachable states in our abstract model are unsatisfiable. Let us explain how we refine the abstraction in each of these cases.

Consider an algorithm interp which returns an interpolant of two given zones Z1, Z2. In what follows, by the *refinement of* D *by* interp(Z1, Z2), we mean the domain D′ obtained from D by adding (k, ≺) to Dx,y for every constraint x − y ≺ k of interp(Z1, Z2). Observe that αD′(Z1) ∩ αD′(Z2) = ∅ in this case.
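Concretely, one can picture the domain as a map from clock pairs (x, y) to the set of available bounds (k, ≺). The sketch below uses this hypothetical representation (not the paper's data structure) to add the interpolant's constraints:

```python
def refine_domain(D, interpolant):
    """D maps a clock pair (x, y) to a set of bounds (k, strict).
    interpolant is a list of constraints (x, y, k, strict), encoding
    x - y < k when strict, and x - y <= k otherwise.
    Returns the refined domain D' without mutating D."""
    D2 = {pair: set(bounds) for pair, bounds in D.items()}
    for (x, y, k, strict) in interpolant:
        D2.setdefault((x, y), set()).add((k, strict))
    return D2
```

After this refinement, the abstractions of the two interpolated zones are disjoint, since every constraint of the interpolant is now representable in D′.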

We define concrete successor and predecessor operations for the actions in Σ. For each a ∈ Σ, let Pre^c_a denote the concrete predecessor operation on zones, defined straightforwardly, and similarly Post^c_a the concrete successor operation.

Consider a domain D and the induced abstraction function αD. Assume that we are given a trace π = A1 −σ1→ A2 −σ2→ · · · −σn−1→ An. Let B1 . . . Bn be the sequence of concrete states visited along π in A, that is, B1 is the concrete initial state and, for all 2 ≤ i ≤ n, Bi = Post^c_{πi−1}(Bi−1), writing πi−1 for the (i−1)-th action of π. This sequence can be computed using DBMs.

The trace is *realizable* if Bn ≠ ∅, in which case the counterexample is confirmed. Otherwise it is *spurious*. We now show how to refine the abstraction to eliminate a spurious trace π.

Let i0 be the maximal index such that Bi0 ≠ ∅. There are three possible reasons explaining why Bi0+1 is empty:

- (a) if πi0 = up: refine D by interp([[Ai0]]↑, [[Ai0+1]]↓).
- (b) if πi0 = r(x): refine D by interp(Free_x([[Ai0]]), Free_x([[Ai0+1]])).

Note that the case <sup>π</sup>i<sup>0</sup> <sup>=</sup> <sup>r</sup><sup>∅</sup> is not possible since this induces the identity function both in the abstract and concrete systems.

Given an abstraction αD and a spurious trace π, let refine(αD, π) denote the refined abstraction αD′ obtained as described above.

The following two lemmas justify the two subcases of the third case above: they prove that the detected spurious transition disappears after refinement. The reset and up operations depend on the abstraction, so we make this dependence explicit below, writing Reset^α_x and Up^α, in order to distinguish the operations before and after a refinement.

**Lemma 12.** *Consider* (A1, A2) ∈ Up^α *with* [[A1]]↑ ∩ [[A2]] = ∅*. Then* [[A1]]↑ ∩ [[A2]]↓ = ∅*. Moreover, if* α′ *is obtained by refinement of* α *by* interp([[A1]]↑, [[A2]]↓)*, then for all* (A′1, A′2) ∈ Up^{α′}*,* [[A′1]] ⊆ [[A1]] *implies* [[A′2]] ∩ [[A2]] = ∅*.*

**Lemma 13.** *Consider* x ∈ C *and* (A1, A2) ∈ Reset^α_x *such that* [[A1]][x ← 0] ∩ [[A2]] = ∅*. Then* Free_x([[A1]]) ∩ Free_x([[A2]]) = ∅*. Moreover, if* α′ *is obtained by refinement of* α *by* interp(Free_x([[A1]]), Free_x([[A2]]))*, then for all* (A′1, A′2) ∈ Reset^{α′}_x *with* [[A′1]] ⊆ [[A1]]*, we have* [[A′2]] ∩ [[A2]] = ∅*.*

#### **5 Experiments**

We implemented both algorithms. The symbolic version was implemented in OCaml using the CUDD library<sup>2</sup>; the explicit version was implemented in C++ within an existing model checker, using the Uppaal DBM library. Both prototypes

<sup>2</sup> http://vlsi.colorado.edu/∼fabio/.

take as input networks of timed automata with invariants, discrete variables, urgent and committed locations. The presented algorithms are adapted to these features without difficulty.

We evaluated our algorithms on three classes of benchmarks that we believe are significant. We compare the performance of our algorithms with that of Uppaal [7], which is based on zones, as well as with the BDD-based model-checking engine of PAT [25]. We were unable to compare with RED [30], which is no longer maintained and not open source, and with which we failed to obtain correct results. The tool used in [16] was not available either. We thus only provide a comparison with two well-maintained tools.

Two of our benchmarks are variants of schedulability-analysis problems where task execution times depend on the internal states of executed processes, so that an analysis of the state space is necessary to obtain a precise answer.

**Monoprocess Scheduling Analysis.** In this variant, a single process sequentially executes tasks on a single machine, and the execution time of each cycle depends on the state of the process. The goal is to determine a bound on the maximum execution time of a single cycle. This depends on the semantics of the process since the bound depends on the reachable states.

More precisely, we built a set of benchmarks where the processes are defined by synchronous circuit models taken from the Synthesis Competition (http://www.syntcomp.org). We assume that each latch of the circuit is associated with a resource, and that changing the state of the resource takes some amount of time. A subset of the latches thus have clocks associated with them, which measure the time elapsed since the latest value change (the latest moment when the value changed from 0 to 1, or from 1 to 0). We provide two positive time bounds, t0 and t1, for each latch, which determine the execution time as follows: if the value of the latch changes from 0 to 1 (resp. from 1 to 0), then the execution time of the present cycle cannot be less than t1 (resp. t0). The execution time of the step is then the minimum value satisfying these constraints.

**Multi-process Stateful Scheduling Analysis.** In this variant, three processes are scheduled on two machines with a round-robin policy. Processes schedule tasks one after the other without any delay. As in the previous benchmarks, a process executing a task (on either machine) corresponds to a step of the synchronous circuit model. Each task is described by a tuple (C1, C2, D) which defines its minimum and maximum execution times and its relative deadline. When a task finishes, the next task arrives immediately. The values in the tuple depend on the state of the process. The goal is to check the absence of any deadline miss. Processes are also instantiated with AIG circuits from http://www.syntcomp.org.

**Asynchronous Computation.** We consider an asynchronous network of "threshold gates", defined as follows: each gate is characterized by a tuple (n, θ, [l, u]) where n is the number of inputs, 0 ≤ θ ≤ n is the threshold, and l ≤ u are lower and upper bounds on activation time. Each gate has an output which is initially undefined. The gate becomes active during the time period [l, u]. During this time, if all inputs are defined, and if at least θ of the inputs have value 1, then it sets its output to 1. At the end of the time period, it becomes deactivated and the output becomes undefined again, until the next period, which starts l time units after the deactivation. The goal is to check whether the given gate can output 1 within a given time bound T.

**Results.** Figure 3 displays the results of our experiments. All algorithms were given 8 GB of memory and a timeout of 30 min, and the experiments were run on a laptop with an Intel i7 3.2 GHz processor running Linux. The symbolic algorithm performs best on the monoprocess and multi-process scheduling benchmarks. Uppaal comes second, but does not solve as many benchmarks as our algorithm. Our enumerative algorithm quickly fails on these benchmarks, often running out of memory. On the asynchronous computation benchmarks, our enumerative algorithm performs remarkably well, beating all other algorithms. We also ran our tools on the CSMA/CD benchmarks (with 3 to 12 processes): Uppaal performs best and our enumerative algorithm is slightly behind, while the symbolic algorithm does not scale and PAT fails to terminate in all cases.

The tool used for the symbolic algorithm is open source and can be found at https://github.com/osankur/symrob along with all the benchmarks.

**Fig. 3.** Comparison of our enumerative and symbolic algorithms (referred to as Absenumerative and Abs-symbolic) with Uppaal and PAT. Each figure is a cactus plot for the set of benchmarks: a point (X, Y ) means X benchmarks were solved within time bound Y .

#### **6 Conclusion and Future Work**

There are several ways to improve the algorithm. Since the choice of interpolants determines the abstraction function and the number of refinements, we assumed that taking the minimal interpolant should be preferable, as it keeps the abstractions as coarse as possible. But it might be better to predict which interpolant is best adapted for the rest of the computation, in order to limit future refinements. The number of refinements also depends on the search order; although this has already been studied in [23], it could be interesting to study it in this setting. Generally speaking, it is worth noting that we currently cannot predict which (variant of) our algorithms is better suited to which model.

Several extensions of our algorithms could be developed, *e.g.* combining our algorithms with other methods based on finer abstractions as in [22], integrating predicate abstraction on discrete variables, or developing SAT-based versions of our algorithms.

#### **References**



### Fast Algorithms for Handling Diagonal Constraints in Timed Automata

Paul Gastin<sup>1</sup>, Sayan Mukherjee<sup>2</sup>, and B. Srivathsan<sup>2</sup>

<sup>1</sup> LSV, ENS Paris-Saclay, CNRS, Université Paris-Saclay, Cachan, France
paul.gastin@lsv.fr

<sup>2</sup> Chennai Mathematical Institute, Chennai, India
{sayanm,sri}@cmi.ac.in

Abstract. A popular method for solving reachability in timed automata proceeds by enumerating reachable sets of valuations represented as zones. A naïve enumeration of zones does not terminate. Various termination mechanisms have been studied over the years. Coming up with efficient termination mechanisms has been remarkably more challenging when the automaton has diagonal constraints in guards.

In this paper, we propose a new termination mechanism for timed automata with diagonal constraints based on a new simulation relation between zones. Experiments with an implementation of this simulation show significant gains over existing methods.

Keywords: Timed automata · Diagonal constraints · Reachability · Zones · Simulations

#### 1 Introduction

Timed automata have emerged as a popular model for systems with real-time constraints [2]. Timed automata are finite automata extended with real-valued variables called *clocks*. All clocks are assumed to start at 0 and increase at the same rate. Transitions of the automaton can make use of these clocks to disallow behaviours which violate timing constraints. This is achieved through *guards*, which are constraints of the form x ≤ 5, x − y ≥ 3, y > 7, etc., where x, y are clocks. A transition guarded by x ≤ 5 can be fired only when the value of clock x is at most 5. Another important feature is the *reset* of clocks in transitions: each transition can specify a subset of clocks whose values become 0 once the transition is fired. The combination of guards and resets allows the automaton to track timing distances between events. A basic question that forms the core of timed automata technology is *reachability*: given a timed automaton, does there

This work is supported by UMI Relax. The first author is partly supported by ANR project TickTac (ANR-18-CE40-0015) and third author by CEFIPRA project IoTTTA (Indo-French program in ICST-DST/CNRS ref. 2016-01). The second and third authors are partly supported by Infosys Foundation (India) and Tata Consultancy Services - Innovation Labs (Pune, India).

exist an execution from its initial state to a final state. This question is known to be decidable [2]. Various algorithms for this problem have been studied over the years and have been implemented in tools [6,21,26,28,31,32].

Since the clocks are real-valued variables, the space of configurations of a timed automaton (consisting of a state and a valuation of the clocks) is infinite, and an explicit enumeration is not possible. The earliest solution to reachability was to partition this space into a finite number of *regions* and build a region graph that provides a finite abstraction of the behaviour of the timed automaton [2]. However, this solution was not practical. Subsequent works introduced the use of *zones* [14]. Zones are special sets of clock valuations with efficient data structures and manipulation algorithms [6]. Within zone-based algorithms, there is a division between forward analysis and backward analysis. The current industry-strength tool UPPAAL [28] implements a forward analysis approach, as this works better in the presence of the other discrete data structures used in UPPAAL models [9]. We focus on this forward analysis approach using zones in this paper.

The forward analysis of a timed automaton essentially enumerates sets of reachable configurations stored as zones. Some extra care needs to be taken for this enumeration to terminate. Traditional development of timed automata made use of *extrapolation* operators over zones to ensure termination. These are functions which map a zone to a bigger zone; importantly, the range of these functions is finite. The goal was to come up with extrapolation operators that are sound: adding these extra valuations should not lead to new behaviours. This is where the role of *simulations* between configurations was studied, and extrapolation operators based on such simulations were devised [14]. A certain extrapolation operation, now known as Extra<sup>M</sup> [5], was proposed, and reachability using Extra<sup>M</sup> was implemented in tools [14].

A seminal paper by Bouyer [9] revealed that Extra<sup>M</sup> is not correct in the presence of *diagonal constraints* in guards. These are constraints of the form x − y ◁ c, where ◁ is either < or ≤, and c is an integer. Moreover, it was proved that no such extrapolation operation can be correct when diagonal constraints are present, whereas for automata without diagonal constraints (henceforth referred to as diagonal-free automata) the extrapolation works. After this result, developments in timed automata reachability focussed on the class of diagonal-free automata [4,5,23,24], and diagonal constraints were mostly sidelined. All these developments have led to quite efficient algorithms for diagonal-free timed automata.

Diagonal constraints are a useful modelling feature and occur naturally in certain problems, especially scheduling [3,17,20,27] and logic-automata translations [16,25], as well as in [29]. It is however known that they do not add any expressive power: every timed automaton can be converted into a diagonal-free timed automaton [7]. This conversion suffers from an exponential blowup, which was later shown to be unavoidable: diagonal constraints can give exponentially more succinct models [10]. Therefore, a good forward analysis algorithm that works directly on a timed automaton with diagonal constraints would be handy. This is the subject of this paper.

*Related Work.* The first attempt at such an algorithm was to split the (extrapolated) zones with respect to the diagonal constraints present in the automaton [6]. This gives a correct procedure, but since zones are split, an enumeration starts from each small zone, leading to an exponential blow-up in the number of visited zones. A second attempt was a more refined conversion into a diagonal-free automaton that detects "relevant" diagonals [13,30] in an iterative manner; to do this, special data structures storing sets of sets of diagonal constraints were used. In [18] we extended the works [5] and [23] on diagonal-free automata to the case of diagonal constraints. All these approaches suffer from either a space or a time bottleneck and do not match the efficiency and scalability of tools for diagonal-free automata.

*Our Contributions.* The goal of this paper is to come up with fast algorithms for handling diagonal constraints. Since the extrapolation-based approach is a dead end, we work with simulations between zones directly, as in [23] and [18]. We propose a new simulation relation between zones that is correct in the presence of diagonal constraints (Sect. 3). We give an algorithm to test this simulation between zones (Sect. 4). We have incorporated this simulation test in (an older version of) the tool TChecker [21], which checks reachability for timed automata, and compared our results with the state-of-the-art tool UPPAAL. Experiments show an encouraging gain, both in the number of zones enumerated and in the time taken by the algorithm, sometimes up to four orders of magnitude (Sect. 6). The main advantage of our approach is that it does not split zones, and furthermore it leverages the optimizations studied for diagonal-free automata.

From a technical point of view, our presentation does not make use of regions and instead works with valuations, zones and simulation relations. We think that this presentation provides a clearer perspective: as a justification of this claim, we extend our simulation in a rather natural manner to timed automata with general updates of the form x := c and x := y + d in transitions, where x, y are clocks and c, d are constants (Sect. 5). In general, reachability for timed automata with updates is undecidable [12]. Some decidable cases have been proposed, for which the algorithms are based on regions. For decidable subclasses containing diagonal constraints, no zone-based approach has been studied. Our proposed method covers these classes, and also benefits from zones and the standard optimizations studied for diagonal-free automata.

Missing proofs can be found in the full version of this paper [19].

#### 2 Preliminaries

Let N be the set of natural numbers, R≥0 the set of non-negative reals and Z the set of integers. Let X be a finite set of variables ranging over R≥0, called *clocks*. Let Φ(X) denote the set of constraints ϕ formed using the following grammar: ϕ := x ◁ c | c ◁ x | x − y ◁ d | ϕ ∧ ϕ, where x, y ∈ X, c ∈ N, d ∈ Z and ◁ ∈ {<, ≤}. Constraints of the form x ◁ c and c ◁ x are called *non-diagonal constraints* and those of the form x − y ◁ d are called *diagonal constraints*. We adopt the convention that in non-diagonal constraints x ◁ c and c ◁ x, the constant c is restricted to N. A *clock valuation* v is a function which maps every clock x ∈ X to a real number v(x) ∈ R≥0. A valuation is said to satisfy a guard g, written v |= g, if replacing every x in g with v(x) makes the constraint g true. For δ ∈ R≥0 we write v + δ for the valuation which maps every x to v(x) + δ. Given a subset of clocks R ⊆ X, we write [R]v for the valuation which maps each x ∈ R to 0 and each x ∉ R to v(x).
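The three operations on valuations defined above (guard satisfaction v |= g, delay v + δ, reset [R]v) are directly executable. A minimal Python rendering, with a guard given as a list of atomic constraints (a representation chosen here for illustration only):

```python
def satisfies(v, guard):
    """v: dict clock -> value. guard: list of (lhs, op, c) atoms where
    lhs is a clock name 'x' or a pair ('x', 'y') for a diagonal x - y."""
    ops = {'<': lambda a, b: a < b, '<=': lambda a, b: a <= b,
           '>': lambda a, b: a > b, '>=': lambda a, b: a >= b}
    def value(lhs):
        return v[lhs] if isinstance(lhs, str) else v[lhs[0]] - v[lhs[1]]
    return all(ops[op](value(lhs), c) for (lhs, op, c) in guard)

def delay(v, d):
    """v + δ: advance every clock by the same amount."""
    return {x: t + d for x, t in v.items()}

def reset(v, R):
    """[R]v: clocks in R go to 0, the others keep their value."""
    return {x: (0 if x in R else t) for x, t in v.items()}
```

Note that a delay shifts all clocks together, which is exactly why a diagonal x − y is insensitive to delays and only changes under resets.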

A *timed automaton* A is a tuple (Q, X, q0, T, F) where Q is a finite set of states, X is a finite set of clocks, q0 ∈ Q is the initial state, F ⊆ Q is the set of accepting states and T ⊆ Q × Φ(X) × 2^X × Q is the set of transitions. Each transition t ∈ T is of the form (q, g, R, q′) where q and q′ are respectively the source and target states, g is a constraint called the *guard*, and R is the set of clocks which are *reset* in t. We call a timed automaton *diagonal-free* if the guards in its transitions do not use diagonal constraints.

A *configuration* of A is a pair (q, v) where q ∈ Q and v is a valuation. The semantics of a timed automaton is given by a transition system S_A whose states are the configurations of A. Transitions in S_A are of two kinds: *delay* transitions are given by (q, v) −δ→ (q, v + δ) for all δ ≥ 0, and *action* transitions are given by (q, v) −t→ (q′, v′) for each t := (q, g, R, q′) such that v |= g and v′ = [R]v. We write −δ,t→ for a delay δ followed by action t. A run of A is an alternating sequence of delay-action transitions starting from the initial state q0 and the initial valuation **0** which maps every clock to 0: (q0, **0**) −δ0,t0→ (q1, v1) −δ1,t1→ · · · (qn, vn). A run of the above form is said to be accepting if the last state qn ∈ F. The *reachability problem* for timed automata is the following: given an automaton A, decide if there exists an accepting run. This problem is known to be PSPACE-complete [2]. Since the semantics S_A is infinite, solutions to the reachability problem work with a finite abstraction of S_A that is sound and complete. Before we explain one of the popular solutions to reachability, we state a result which allows the conversion of every timed automaton into a diagonal-free timed automaton.

Theorem 1. *[7] For every timed automaton* A*, there exists a diagonal-free timed automaton* Adf *s.t. there is a bijection between runs of* A *and* Adf*. The number of states in* Adf *is* 2<sup>d</sup> · n*, where* d *is the number of diagonal constraints and* n *is the number of states of* A*.*

The above theorem allows us to solve reachability for a timed automaton A by first converting it into the diagonal-free automaton Adf and then checking reachability on Adf. However, this conversion comes with a systematic exponential blowup (in the number of diagonal constraints present in A), and it was shown in [10] that such a blowup is unavoidable in general. We will now recall the general algorithm for analyzing timed automata, and then move into specific details which depend on whether the automaton has diagonal constraints or not.

Zones and Simulations. Fix a timed automaton A with clock set X for the rest of the discussion in this section. As the space of valuations of A is infinite, algorithms work with sets of valuations called *zones*. A zone is a set of clock valuations given by a conjunction of constraints of the form x − y ◁ c, x ◁ c and c ◁ x where c ∈ Z and ◁ ∈ {<, ≤}; for example, the solution set of x − y < 5 ∧ y ≤ 10 is a zone. The transition relation over configurations (q, v) is extended to nodes (q, Z) where Z is a zone. We define the following operations on zones, given a guard g and a set of clocks R: time elapse −→Z := {v + δ | v ∈ Z, δ ≥ 0}; guard intersection Z ∧ g := {v | v ∈ Z and v |= g}; and reset [R]Z := {[R]v | v ∈ Z}. It can be shown that all these operations result in zones. Zones can be efficiently represented and manipulated using Difference Bound Matrices (DBMs) [15].
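The classic DBM implementations of these operations can be sketched compactly. The toy code below represents a zone as a matrix M where M[i][j] bounds x_i − x_j (index 0 being the constant-zero clock), and for simplicity ignores strict-versus-non-strict bounds; it is an illustration of the standard technique, not the Uppaal DBM library's API.

```python
INF = float('inf')

def canonical(M):
    """Tighten all bounds with Floyd-Warshall (all-pairs shortest paths)."""
    n = len(M)
    M = [row[:] for row in M]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                M[i][j] = min(M[i][j], M[i][k] + M[k][j])
    return M

def time_elapse(M):
    """Time elapse of Z: drop the upper bounds x_i - 0 <= c on every clock."""
    M = [row[:] for row in M]
    for i in range(1, len(M)):
        M[i][0] = INF
    return M

def reset_clock(M, k):
    """[{x_k}]Z: after the reset, x_k behaves exactly like the 0 clock."""
    M = canonical(M)
    for j in range(len(M)):
        M[k][j] = M[0][j]   # copy row of the 0 clock
    for j in range(len(M)):
        M[j][k] = M[j][0]   # copy column of the 0 clock
    return M
```

Guard intersection is simply a `min` on the corresponding entries followed by re-canonicalization, which is why all three operations stay within zones.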

The *zone graph* ZG(A) of a timed automaton A is a transition system whose nodes are of the form (q, Z) where q is a state of A and Z is a zone. For each transition t := (q, g, R, q′) of A and each node (q, Z) there is a transition (q, Z) ⇒t (q′, Z′) where Z′ is the time elapse of [R](Z ∧ g). The initial node is (q0, Z0) where q0 is the initial state of A and Z0 = {**0** + δ | δ ≥ 0} is the zone obtained by elapsing an arbitrary delay from the initial valuation. A path in the zone graph is a sequence (q0, Z0) ⇒t0 (q1, Z1) ⇒t1 · · · ⇒tn−1 (qn, Zn) starting from the initial node. The path is said to be accepting if qn is an accepting state. The zone graph is known to be sound and complete for reachability.

#### Theorem 2. *[14]* A *has an accepting run iff* ZG(A) *has an accepting path.*

This does not yet give an algorithm as the zone graph ZG(A) is still not finite. Moreover, there are examples of automata for which the reachable part of ZG(A) is also infinite: starting from the initial node, applying the successor computation leads to infinitely many zones. Two different approaches have been studied to get finiteness, both of them based on the usage of *simulation relations*.

A (time-abstract) simulation relation ≼ between configurations of A is a reflexive and transitive relation such that (q, v) ≼ (q′, v′) implies q = q′ and (1) for every δ ≥ 0, there exists δ′ ≥ 0 such that (q, v + δ) ≼ (q, v′ + δ′), and (2) for every transition t of A, if (q, v) −t→ (q1, v1) then (q, v′) −t→ (q1, v′1) with (q1, v1) ≼ (q1, v′1).

We say v ≼ v′, read "v is simulated by v′", if (q, v) ≼ (q, v′) for all states q. The simulation relation can be extended to zones: Z ≼ Z′ if for every v ∈ Z there exists v′ ∈ Z′ such that v ≼ v′. We write ↓Z for {v | ∃v′ ∈ Z s.t. v ≼ v′}. The simulation relation is said to be finite if the function mapping zones Z to the down sets ↓Z has finite range. We now recall a specific simulation relation ≼LU [5,23]. Current algorithms and tools for diagonal-free automata are based on this simulation. The conditions required for v ≼LU v′ ensure that when all lower-bound constraints c ◁ x satisfy c ≤ L(x) and all upper-bound constraints x ◁ c satisfy c ≤ U(x), whenever v satisfies such a constraint, v′ also satisfies it.

Definition 1 (LU-bounds and the relation ≼LU [5,23]). *An* LU*-bounds function is a pair of functions* L : X → N ∪ {−∞} *and* U : X → N ∪ {−∞} *that map each clock to either a non-negative constant or* −∞*. Given an* LU*-bounds function, we define* v ≼LU v′ *for valuations* v, v′ *if for every clock* x ∈ X*:*

$$v'(x) < v(x) \text{ implies } L(x) < v'(x) \quad \text{and} \quad v(x) < v'(x) \text{ implies } U(x) < v(x).$$
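Definition 1 is directly executable on individual valuations. A small check, with L and U given as dicts and missing entries treated as −∞ (a representation chosen here for illustration):

```python
def simulated_lu(v, vp, L, U):
    """True iff v ≼LU v' (v is simulated by v'): for each clock x,
    v'(x) < v(x) implies L(x) < v'(x), and
    v(x)  < v'(x) implies U(x) < v(x)."""
    NEG_INF = float('-inf')
    return all(
        (not (vp[x] < v[x]) or L.get(x, NEG_INF) < vp[x]) and
        (not (v[x] < vp[x]) or U.get(x, NEG_INF) < v[x])
        for x in v)
```

Intuitively, v′ may lie below v only when both are already above every relevant lower bound, and above v only when v already exceeds every relevant upper bound, so no guard of the automaton can tell them apart.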

Reachability in Diagonal-Free Timed Automata. A natural method to obtain finiteness of the zone graph is to prune the zone graph computation through simulations Z ≼ Z′: do not explore a node (q, Z) if there is an already visited node (q, Z′) such that Z ≼ Z′. Since these simulation tests need to be done often during the zone graph computation, an efficient algorithm for performing them is crucial. Note that Z ≼ Z′ iff Z ⊆ ↓Z′. However, it is known that the set ↓Z is not necessarily a zone (this was proved for ↓LU Z in [5]), and hence no simple zone inclusion applies. The first algorithms for timed automata followed a different approach, which we call the *extrapolation* approach: whenever a new zone Z is discovered by the algorithm, a bigger zone Extra(Z) ⊇ Z is computed and stored in place of Z.

*Reachability Algorithm Using Zone Extrapolation.* The input to the algorithm is a timed automaton A. The algorithm maintains two lists, Passed and Waiting. Initially, the node (q0, Extra(Z0)) is added to the Waiting list (recall that (q0, Z0) is the initial node of the zone graph ZG(A)). Wlog. we assume that q<sup>0</sup> is not accepting. The algorithm repeatedly performs the following steps:


Several extrapolation operators (ExtraM, ExtraLU, Extra+LU) were introduced in [5]. The function Extra+LU has two nice properties: (1) Extra+LU(Z) ⊆ ↓LU Z and (2) Extra+LU(Z) is a zone for every Z. These properties yield an algorithm that performs only efficient zone operations: successor computations and zone inclusions.

*Reachability Algorithm Using Simulations.* The initial node (q0, Z0) is added to the Waiting list. Wlog. we assume that q<sup>0</sup> is not accepting. The algorithm repeatedly performs the following steps:


An O(|X|<sup>2</sup>) algorithm for deciding Z ≼LU Z′ was proposed in [23]. The efficiency of this simulation check makes it well suited for use in practice. Moreover, as Extra+LU(Z) ⊆ ↓LU Z, we expect to get more simulations (and hence quicker termination) through ≼LU.
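Both variants of the reachability algorithm follow the same Passed/Waiting worklist scheme. The schematic Python sketch below abstracts the two: `covered(Z, Z′)` stands for zone inclusion into an extrapolated zone in the first variant, and for the simulation test Z ≼ Z′ in the second; all names are placeholders, not the tools' APIs.

```python
def reach(init, successors, is_accepting, covered):
    """Worklist reachability over (q, Z) nodes: skip any node that is
    covered (subsumed) by a node already in the Passed list."""
    waiting = [init]
    passed = []
    while waiting:
        node = waiting.pop()
        if is_accepting(node):
            return True
        if any(covered(node, old) for old in passed):
            continue                      # subsumed: do not explore further
        passed.append(node)
        waiting.extend(successors(node))  # the (q, Z) =>t (q', Z') edges
    return False
```

The strength of the covering test is what governs termination and the size of the Passed list, which is why a coarser simulation directly translates into fewer enumerated zones.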

Reachability in the Presence of Diagonal Constraints. The ≼LU relation is no longer a simulation when diagonal constraints are present. Moreover, it was shown in [9] that no extrapolation operator (along the lines of Extra+LU) can work in the presence of diagonal constraints. The first option to deal with diagonals is to use Theorem 1 to get a diagonal-free automaton and then apply the methods discussed previously. One problem with this is the systematic exponential blowup introduced in the number of states of the resulting automaton. Another problem is obtaining diagnostic information: counterexamples need to be translated back to the original automaton [6]. Various methods have been studied to circumvent the diagonal-free conversion and instead work directly on the automaton with diagonal constraints. We recall the approach used in the state-of-the-art tool UPPAAL below.

*Zone Splitting* [6]. The paper introducing timed automata gave a notion of equivalence between valuations v ∼<sub>M</sub> v′, parameterized by a function M mapping each clock x to the maximum constant among the guards of the automaton that involve x. This equivalence is a finite simulation for diagonal-free automata. Equivalence classes of ∼<sub>M</sub> are called regions. This was extended to the diagonal case by [6] as: v ∼<sup>d</sup><sub>M</sub> v′ if v ∼<sub>M</sub> v′ and, for every diagonal constraint g present in the automaton, v |= g implies v′ |= g. The ∼<sup>d</sup><sub>M</sub> relation splits the regions further, so that each region is either entirely inside g or entirely outside g, for each g. The next step is to use this notion of equivalence on zones. The paper [6] follows the extrapolation approach: to each zone Z, an extrapolation operation Extra<sub>M</sub>(Z) is applied; this adds some valuations which are ∼<sub>M</sub>-equivalent to valuations in Z; the result is then further split into multiple zones, so that each small zone is either inside g or outside g for each diagonal constraint g. If d is the number of diagonal constraints present in the automaton, this splitting can give rise to 2<sup>d</sup> zones for each zone Z. From each small zone, the zone graph computation is restarted. Essentially, the exponential blow-up which appeared at the state level in the diagonal-free conversion now appears at the zone level.

In this paper, we propose a new simulation relation to handle diagonal constraints. This has two advantages: it avoids the blow-up in the number of nodes arising from zone splitting, and the simulation test between zones has an efficient implementation that is significantly quicker than the simulation of [18].

#### 3 A New Simulation Relation

We start with the definition of a relation between timed automata configurations, which in some sense "declares" upfront what we need from a simulation relation usable in a reachability algorithm. As we proceed, we will make its description more concrete and give an effective simulation test between zones that can be implemented. Fix a clock set X; this generates the constraints Φ(X).

Definition 2 (the relation ⊑<sub>G</sub>). *Let* G *be a (finite or infinite) set of constraints. We say* v ⊑<sub>G</sub> v′ *if for all* ϕ ∈ G *and all* δ ≥ 0*,* v + δ |= ϕ *implies* v′ + δ |= ϕ*.*

Our goal is to use the above relation as a simulation (as defined in p. xx) for a timed automaton. Directly from the definition, we get the following lemma, which shows that the ⊑<sub>G</sub> relation is preserved under time elapse.

Lemma 1. *If* v ⊑<sub>G</sub> v′*, then* v + δ ⊑<sub>G</sub> v′ + δ *for all* δ ≥ 0*.*

The other kind of transformation over valuations is resets. Given sets of guards G<sub>1</sub>, G and a set of clocks R, we want to find conditions on G<sub>1</sub> and G so that if v ⊑<sub>G1</sub> v′ then [R]v ⊑<sub>G</sub> [R]v′. To do this, we need to answer the question: what guarantees should we ensure for v, v′ (via G<sub>1</sub>) so that [R]v ⊑<sub>G</sub> [R]v′? This motivates the next definition.

Definition 3 (weakest pre-condition of ⊑<sub>G</sub> over resets). *For a constraint* ϕ *and a set of clocks* R*, we define a set of constraints* wp(ϕ, R) *as follows: when* ϕ *is of the form* x ⊴ c *or* c ⊴ x *(where* ⊴ *stands for either* < *or* ≤*), then* wp(ϕ, R) *is empty if* x ∈ R *and is* {ϕ} *otherwise; when* ϕ *is a diagonal constraint* x − y ⊴ c*, then* wp(ϕ, R) *is:*

*–* {x − y ⊴ c} *if* {x, y} ∩ R = ∅
*–* {x ⊴ c} *if* y ∈ R*,* x ∉ R *and* c ≥ 0
*–* {−c ⊴ y} *if* x ∈ R*,* y ∉ R *and* −c ≥ 0

*– empty, otherwise.*

*For a set of guards* G*, we define* wp(G, R) := ⋃<sub>ϕ∈G</sub> wp(ϕ, R)*.*
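A minimal sketch of Definition 3, assuming constraints encoded as tuples: `('upper', x, c)` for x ⊴ c, `('lower', x, c)` for c ⊴ x, and `('diag', x, y, c)` for x − y ⊴ c (the strict/non-strict distinction is elided; the encoding is ours, not the paper's).

```python
def wp_reset(phi, R):
    """Weakest pre-condition of one constraint over a reset set R (Definition 3)."""
    if phi[0] in ('upper', 'lower'):        # x <= c  or  c <= x
        return set() if phi[1] in R else {phi}
    _, x, y, c = phi                        # diagonal: x - y <= c
    if x not in R and y not in R:
        return {phi}                        # untouched by the reset
    if y in R and x not in R and c >= 0:    # y becomes 0, so x - 0 <= c
        return {('upper', x, c)}
    if x in R and y not in R and -c >= 0:   # x becomes 0, so -y <= c, i.e. -c <= y
        return {('lower', y, -c)}
    return set()                            # remaining cases yield no constraint

def wp_guards(G, R):
    """wp over a set of guards: union of the per-constraint pre-conditions."""
    return set().union(set(), *(wp_reset(phi, R) for phi in G))
```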

Note that the relation ⊑<sub>G</sub> is parameterized by a set of constraints. We want this set to be finite, so that the relation can be used in an algorithm. We first need to link an automaton A with such a set of constraints. One way to do this is to take the set of all guards present in the automaton and close it under weakest pre-conditions with respect to all possible subsets of clocks. A better approach is to consider a set of constraints for each state, as in [4], where the parameters for extrapolation (the maximum constants appearing in guards) are calculated at each state.

Definition 4 (State based guards). *Let* A = (Q, X, q<sub>0</sub>, T, F) *be a timed automaton. We associate a set of guards* G(q) *with each state* q ∈ Q*, which is the least set of guards (for the coordinate-wise subset inclusion order) such that for every transition* (q, g, R, q<sub>1</sub>)*: the guard* g *and the set* wp(G(q<sub>1</sub>), R) *are present in* G(q)*. More precisely,* {G(q)}<sub>q∈Q</sub> *is the least solution to the following set of equations, one for each* q ∈ Q*:*

$$\mathcal{G}(q) = \bigcup\_{(q,g,R,q\_1)\in T} \{g\} \cup \operatorname{wp}(\mathcal{G}(q\_1), R)$$

All constraints present in the set wp(G(q<sub>1</sub>), R) contain constants which are already present in G(q<sub>1</sub>). The least solution to the above set of equations can therefore be obtained by a fixed point computation which starts with G(q) set to ⋃<sub>(q,g,R,q<sub>1</sub>)∈T</sub> {g} and then repeatedly adds the weakest pre-conditions. Since no new constants are generated in this process, the fixed point computation terminates. We now have the ingredients to define a simulation relation over configurations of a timed automaton with diagonal constraints.
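This fixed point computation can be sketched as follows. It is a hedged illustration assuming tuple-encoded non-diagonal constraints (`('upper', x, c)` for x ⊴ c, `('lower', x, c)` for c ⊴ x) and a `wp` restricted to them; a full version would use the complete weakest pre-condition of Definition 3.

```python
def wp(G, R):
    # simplified wp: a non-diagonal constraint on x survives iff x is not reset
    return {phi for phi in G if phi[1] not in R}

def state_guards(states, transitions):
    """Least solution of the G(q) equations (Definition 4).
    transitions: list of (q, guard_set, reset_set, q1)."""
    G = {q: set() for q in states}
    for (q, g, R, q1) in transitions:
        G[q] |= g                        # every outgoing guard belongs to G(q)
    changed = True
    while changed:                       # Kleene iteration; terminates because
        changed = False                  # no new constants are ever created
        for (q, g, R, q1) in transitions:
            new = wp(G[q1], R) - G[q]
            if new:
                G[q] |= new
                changed = True
    return G
```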

Definition 5 (A-simulation). *Let* A = (Q, X, q<sub>0</sub>, T, F) *be a timed automaton and let the set of guards* G(q) *of Definition 4 be associated with every state* q ∈ Q*. We define a relation* ⪯<sub>A</sub> *between configurations of* A *as:* (q, v) ⪯<sub>A</sub> (q, v′) *if* v ⊑<sub>G(q)</sub> v′*.*

Lemma 2. *The relation* ⪯<sub>A</sub> *is a simulation on the configurations of the timed automaton* A*.*

As pointed out before, Definition 2 gives a declarative description of the simulation, and it is unclear how to work with it algorithmically, even when the set of constraints G is finite. The main issue is the ∀δ quantification, which is not finite. We first provide a characterization showing that this ∀δ quantification is irrelevant for diagonal constraints (essentially because the value of v(x) − v(y) does not change with time elapse). Given a set of constraints G, let G<sup>−</sup> ⊆ G be the set of non-diagonal constraints in G.

Proposition 1. v ⊑<sub>G</sub> v′ *iff* v ⊑<sub>G−</sub> v′ *and for all diagonal constraints* ϕ ∈ G*, if* v |= ϕ *then* v′ |= ϕ*.*

It now amounts to solving the ∀δ problem for non-diagonal constraints. It turns out that the ⊑<sub>LU</sub> simulation achieves this, almost. We will see this in more detail in the next section.

#### 4 Algorithm for Z ⊑<sub>G</sub> Z′

Fix a finite set of guards G. We extend the definition of ⊑<sub>G</sub> to zones: Z ⊑<sub>G</sub> Z′ if for all v ∈ Z there exists a v′ ∈ Z′ such that v ⊑<sub>G</sub> v′. In this section, we use the characterization of ⊑<sub>G</sub> given in Proposition 1 and give an algorithm to check Z ⊑<sub>G</sub> Z′ that uses as an oracle a test Z ⊑<sub>G−</sub> Z′. We discuss the computation of Z ⊑<sub>G−</sub> Z′ later in this section. We start with an observation following from Proposition 1.

Lemma 3. *Let* ϕ := x − y ⊴ c *be a diagonal constraint in* G*. Then* Z ⊑<sub>G</sub> Z′ *if and only if* Z ∩ ϕ ⊑<sub>G′</sub> Z′ ∩ ϕ *and* Z ∩ ¬ϕ ⊑<sub>G′</sub> Z′*, where* G′ = G \ {ϕ}*. If* G *has no diagonal constraints, then* Z ⊑<sub>G</sub> Z′ *if and only if* Z ⊑<sub>G−</sub> Z′*.*

This leads to the following algorithm, consisting of two mutually recursive procedures. The algorithm is essentially an implementation of the above lemma, with two optimizations:


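The recursion of Lemma 3 can be sketched as below. This is a hedged toy version: zones are finite sets of valuation tuples rather than DBMs, `nondiag_sim` stands in for the oracle Z ⊑<sub>G−</sub> Z′, and the early exit on an empty left zone illustrates the kind of preliminary check used to cut down the number of recursive calls.

```python
def sat(v, phi):
    """phi = (i, j, c) encodes the diagonal x_i - x_j <= c on a valuation tuple."""
    i, j, c = phi
    return v[i] - v[j] <= c

def sim(Z, Zp, diagonals, nondiag_sim):
    """Check Z simulated by Zp w.r.t. the listed diagonals plus an
    implicit non-diagonal part decided by the oracle nondiag_sim."""
    if not Z:
        return True                      # empty left zone: trivially simulated
    if not diagonals:
        return nondiag_sim(Z, Zp)        # base case of Lemma 3
    phi, rest = diagonals[0], diagonals[1:]
    inside = {v for v in Z if sat(v, phi)}
    zp_in = {v for v in Zp if sat(v, phi)}
    # valuations satisfying phi must be simulated by valuations satisfying phi;
    # the remaining valuations are unconstrained by phi
    return (sim(inside, zp_in, rest, nondiag_sim)
            and sim(Z - inside, Zp, rest, nondiag_sim))
```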
Computing *Z* ⊑<sub>G−</sub> *Z′*. We will use ⊑<sub>LU</sub> to approximate ⊑<sub>G−</sub>: in our implementation of the above algorithms, we replace Z ⊑<sub>G−</sub> Z′ with Z ⊑<sub>LU</sub> Z′. This works because, for an appropriate choice of LU (explained below), we have Z ⊑<sub>LU(G)</sub> Z′ ⇒ Z ⊑<sub>G−</sub> Z′. The converse is not true, as the LU bound functions cannot distinguish between guards with < and ≤ comparisons. Therefore, the ⊑<sub>LU</sub> simulation does not characterize v ⊑<sub>G−</sub> v′ completely. Although we are aware of the (rather technical) modifications to the ⊑<sub>LU</sub> simulation that are needed for this characterization, we choose to use the existing ⊑<sub>LU</sub> directly, as it is safe to do so and it has already been implemented in tools. This gives us a finer simulation than v ⊑<sub>G−</sub> v′.

Definition 6 (LU-bounds from G). *Let* G *be a finite set of constraints. We define* LU(G) *to denote the pair of functions* L<sub>G</sub> *and* U<sub>G</sub> *defined as follows:*

$$L\_G(x) = \begin{cases} -\infty & \text{if there is no guard of the form } c \trianglelefteq x \text{ in } G \\ \max\{c \mid (c \trianglelefteq x) \in G\} & \text{otherwise} \end{cases}$$

$$U\_G(x) = \begin{cases} -\infty & \text{if there is no guard of the form } x \trianglelefteq c \text{ in } G \\ \max\{c \mid (x \trianglelefteq c) \in G\} & \text{otherwise} \end{cases}$$
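A hedged sketch of this extraction, with constraints encoded as tuples (`('upper', x, c)` for x ⊴ c, `('lower', x, c)` for c ⊴ x, `('diag', x, y, c)` for diagonals; the encoding is ours):

```python
def lu_bounds(G, clocks):
    """Extract the LU-bound functions L_G and U_G from a finite constraint set."""
    L = {x: float('-inf') for x in clocks}   # max lower-bound constant per clock
    U = {x: float('-inf') for x in clocks}   # max upper-bound constant per clock
    for phi in G:
        if phi[0] == 'lower':                # c <= x contributes to L
            L[phi[1]] = max(L[phi[1]], phi[2])
        elif phi[0] == 'upper':              # x <= c contributes to U
            U[phi[1]] = max(U[phi[1]], phi[2])
        # diagonal constraints carry no LU information in this definition
    return L, U
```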

Lemma 4. *For every set of constraints* G*,* v ⊑<sub>LU(G)</sub> v′ *implies* v ⊑<sub>G−</sub> v′*.*

The above observations call for the next definition and subsequent lemmas.

Definition 7 (approximating ⊑<sub>G</sub>). *Let* G *be a finite set of constraints. We define a relation* ⊑<sup>LU</sup><sub>G</sub> *as follows:* v ⊑<sup>LU</sup><sub>G</sub> v′ *if* v ⊑<sub>LU(G)</sub> v′ *and for all diagonal constraints* ϕ ∈ G*, if* v |= ϕ *then* v′ |= ϕ*. Similarly, define* ⪯<sup>LU</sup><sub>A</sub> *as:* (q, v) ⪯<sup>LU</sup><sub>A</sub> (q, v′) *if* v ⊑<sup>LU</sup><sub>G(q)</sub> v′*.*

Lemma 5. *The relation* ⪯<sup>LU</sup><sub>A</sub> *is a finite simulation on the configurations of* A*.*

The above lemma, together with the fact that Z ⊑<sub>LU(G)</sub> Z′ can be checked in O(|X|<sup>2</sup>) [23,33], implies the following theorem.

Theorem 3. *When using* Z ⊑<sub>LU(G)</sub> Z′ *in place of* Z ⊑<sub>G−</sub> Z′*, the algorithm is correct and terminates in* O(2<sup>d</sup> · |X|<sup>2</sup>)*, where* d *is the number of diagonal guards in* G*.*

From a complexity viewpoint, this algorithm is not efficient, since it makes a number of calls that is exponential in the number of diagonal constraints (in fact, this may not be avoidable, due to Lemma 6, which follows from the NP-hardness result in [18]). Although the algorithm does involve many calls, the internal operations in each call are simple zone manipulations. Moreover, the preliminary checks (for instance, line 6 of Algorithm 1) cut down the number of calls. This is visible in our experiments, which are very good, especially with respect to running time, compared to other methods. A similar hardness was shown for a different simulation in [18], but the implementation there indeed witnessed the hardness, as the time taken by that algorithm was unsatisfactory.

Lemma 6. *Deciding* Z ⊑<sup>LU</sup><sub>G</sub> Z′ *is NP-complete.*

#### 5 Simulations for Updatable Timed Automata

In the timed automata considered so far, clocks are only reset to 0 along transitions. In this section, we consider more sophisticated transformations of clocks on transitions, called *updates*. An update up : ℝ<sup>|X|</sup><sub>≥0</sub> → ℝ<sup>|X|</sup> is a function mapping non-negative |X|-dimensional reals (valuations) v to arbitrary |X|-dimensional reals (which may a priori not be valuations, as some coordinates may be negative). The update function up is given by a set of atomic updates up<sub>x</sub>, one for each x ∈ X, of the form x := c or x := y + d where c ∈ ℕ, d ∈ ℤ and y ∈ X (possibly equal to x). Note that we allow d to be an arbitrary integer, since clocks may be decremented, whereas c ∈ ℕ since clocks are non-negative. Given a valuation v and an update up, the valuation up(v) is:

$$up(v)(x) := \begin{cases} c & \text{if } up\_x \text{ is } x := c \\ v(y) + d & \text{if } up\_x \text{ is } x := y + d \end{cases}$$

Note that in general, due to the presence of updates x := y + d, the result up(v) may not be a clock valuation. However, when it is, it can be used as a transformation in timed automata transitions. We write up(v) ≥ 0 if up(v)(x) ≥ 0 for all clocks x ∈ X.
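Applying an update can be sketched as follows (a hedged encoding of ours: each atomic update is `('const', c)` for x := c or `('shift', y, d)` for x := y + d; all atoms read the old valuation simultaneously).

```python
def apply_update(v, up):
    """Return up(v), or None when the result is not a valuation
    (some clock would become negative, so the transition is disabled)."""
    out = {}
    for x, atom in up.items():
        if atom[0] == 'const':           # x := c
            out[x] = atom[1]
        else:                            # x := y + d, reading the OLD value of y
            _, y, d = atom
            out[x] = v[y] + d
    if any(val < 0 for val in out.values()):
        return None
    return out
```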

An *updatable timed automaton (UTA)* A = (Q, X, q<sub>0</sub>, T, F) is an extension of a classic timed automaton with transitions of the form (q, g, up, q′), where up is an update. The semantics extend in the natural way: delay transitions remain the same, and for an action transition t := (q, g, up, q′) we have (q, v) →<sup>t</sup> (q′, v′) if v |= g, up(v) ≥ 0, and v′ = up(v). We allow the transition only if the update results in a valuation. The reachability problem for these automata is known to be undecidable in general [12]. Various subclasses with decidable reachability have been discussed in the same paper. Decidability proofs in [12] have the following flavour, for a given automaton A: (1) divide the space of all valuations into a finite number of equivalence classes called *regions*; (2) to build the parameters for the equivalence, derive a set of Diophantine equations from the guards of A; if they have a solution, construct the quotient graph of the equivalence (called the region graph) parameterized by the obtained solution and check reachability on it; if the equations have no solution, output that reachability for A cannot be answered. Sufficient conditions on the nature of the updates that guarantee a solution to the Diophantine equations have been tabulated in [12]. When the automaton is diagonal-free, the region equivalence can be used to build an extrapolation operation, which in turn can be used in a reachability algorithm with zones. When the automaton contains diagonals, the region equivalence is used only to build a region graph: no effective zone-based approach has been studied.

We use a similar idea, but with two fundamental differences: (1) we obtain reachability through the use of simulations on zones, and (2) we build equations over sets of guards as in Definition 4. The advantage of this approach is that it allows the use of coarser simulations over zones. Even for automata with diagonal constraints and updates, we get a zone based algorithm, instead of resorting to regions, which are not efficient in practice.

The notion of simulation from p. xx remains the same, now using the semantics of transitions with updates. We re-use the simulation relation ⊑<sub>G</sub>. We need to extend Definition 3 to incorporate updates; we do this below. A piece of notation: for an update function up, we write up(x) for c if up<sub>x</sub> is x := c, and up(x) for y + c if up<sub>x</sub> is x := y + c.

#### Definition 8 (weakest pre-condition of ⊑<sub>G</sub> over updates).

*Let* up *be an update.*

*For a constraint* ϕ *of the form* x ⊴ c *or* c ⊴ x*, we define* wp(ϕ, up) *to be respectively* {up(x) ⊴ c} *or* {c ⊴ up(x)} *if the resulting constraint is of the form* z ⊴ d *or* d ⊴ z *with* z ∈ X *and* d ≥ 0*; otherwise* wp(ϕ, up) *is empty.*

*For a constraint* ϕ : x − y ⊴ c*, we define* wp(ϕ, up) *to be* {up(x) − up(y) ⊴ c} *if this constraint is either a diagonal using two different clocks, or of the form* z ⊴ d *or* d ⊴ z *with* d ≥ 0*; otherwise* wp(ϕ, up) *is empty.*

*For a set of guards* G*, we define* wp(G, up) := ⋃<sub>ϕ∈G</sub> wp(ϕ, up)*.*

Some examples: wp(x ≤ 5, x := x + 10) is empty, since up(x) is x + 10 and the constraint x + 10 ≤ 5 is not satisfiable over clocks; wp(x ≤ 5, x := x − 10) is {x ≤ 15}; wp(x ≤ 5, x := c) is empty; wp(x − y ≤ 5, x := z<sub>1</sub>, y := z<sub>2</sub> + 10) is z<sub>1</sub> − (z<sub>2</sub> + 10) ≤ 5, giving the constraint z<sub>1</sub> − z<sub>2</sub> ≤ 15; wp(x − y ≤ 5, x := z + c<sub>1</sub>, y := z + c<sub>2</sub>) is empty; wp(x − y ≤ 5, x := c<sub>1</sub>, y := z + c<sub>2</sub>) is {c ≤ z} with c = c<sub>1</sub> − 5 − c<sub>2</sub> if c ≥ 0, and is empty otherwise.
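These examples can be reproduced mechanically. A hedged sketch of Definition 8, with constraints encoded as `('upper', x, c)` / `('lower', x, c)` / `('diag', x, y, c)` and update atoms as `('const', c)` / `('shift', y, d)` (both encodings are ours, not the paper's):

```python
def wp_update(phi, up):
    """Weakest pre-condition of one constraint over an update (Definition 8)."""
    def subst(x):                        # up(x) as (clock or None, constant offset)
        atom = up[x]
        return (None, atom[1]) if atom[0] == 'const' else (atom[1], atom[2])

    if phi[0] in ('upper', 'lower'):     # x <= c becomes z <= c - d, and dually
        _, x, c = phi
        z, d = subst(x)
        if z is None or c - d < 0:       # constant substituted, or negative bound
            return set()
        return {(phi[0], z, c - d)}

    _, x, y, c = phi                     # diagonal: x - y <= c
    zx, dx = subst(x)
    zy, dy = subst(y)
    if zx is not None and zy is not None:
        if zx == zy:                     # constant difference: nothing to require
            return set()
        return {('diag', zx, zy, c + dy - dx)}
    if zx is not None:                   # up(y) is a constant: upper bound on zx
        b = c + dy - dx
        return {('upper', zx, b)} if b >= 0 else set()
    if zy is not None:                   # up(x) is a constant: lower bound on zy
        b = dx - c - dy
        return {('lower', zy, b)} if b >= 0 else set()
    return set()                         # both sides constant
```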

Definition 9 (State based guards). *Let* A = (Q, X, q<sub>0</sub>, T, F) *be a UTA. We associate a set of constraints* G(q) *with each state* q ∈ Q*, which is the least set of constraints (for the coordinate-wise subset inclusion order) such that for every transition* (q, g, up, q<sub>1</sub>)*: the guard* g *and the set* wp(G(q<sub>1</sub>), up) *are present in* G(q)*, and in addition, constraints that allow the update to happen are also present in* G(q)*. The last condition is given by the weakest pre-condition of the set of constraints* {x ≥ 0 | x ∈ X}*. Overall,* {G(q)}<sub>q∈Q</sub> *is the least solution to the following set of equations, one for each* q ∈ Q*:*

$$\mathcal{G}(q) = \bigcup\_{(q,g,up,q\_1)\in T} \left( \{g\} \cup \operatorname{wp}(\{x\geq 0 \mid x\in X\},up) \cup \operatorname{wp}(\mathcal{G}(q\_1),up) \right)$$

*The least solution* {G(q)}<sub>q∈Q</sub> *is said to be finite if each* G(q) *is a finite set of constraints.*

In contrast to the simple reset case, the above set of equations may not have a finite solution. Consider a self-looping transition (q, x ⊴ c, x := x − 1, q). We require x ⊴ c ∈ G(q). Now, wp(x ⊴ c, x := x − 1) is x ⊴ c + 1, which should also be in G(q) according to the above equation. Continuing this process, we need to add x ⊴ d for every natural number d ≥ c. Indeed, this is consistent with the undecidability of reachability when subtraction updates are allowed. We deal with the subject of finite solutions to the above equations later in this section. On the other hand, when the above system does have a solution with a finite G(q) at every q, we can use the ⪯<sub>A</sub> simulation of Definition 5 and its approximation ⪯<sup>LU</sup><sub>A</sub> to get an algorithm.

Proposition 2. *Let* A = (Q, X, q<sub>0</sub>, T, F) *be a UTA. Let* {G(q)}<sub>q∈Q</sub> *be the least solution to the equations given in Definition 9. Then, the relation* ⪯<sub>A</sub> *is a simulation on the configurations of* A*.*

Lemma 7. *For a UTA* A*, assume that the least solution* {G(q)}<sub>q∈Q</sub> *to the state-based guards equations is finite. Then the relation* ⪯<sup>LU</sup><sub>A</sub> *is a finite simulation on the configurations of* A*.*

*Finite Solution to the State-Based Guards Equations.* The least solution to the equations of Definition 9 can be obtained by a standard Kleene iteration for fixed point computation. For each i ≥ 0 and each state q, define:

$$\begin{aligned} \mathcal{G}^0(q) &= \bigcup\_{(q,g,up,q')\in T} \{g\} \cup \operatorname{wp}(\{x \ge 0 \mid x \in X\}, up) \\ \mathcal{G}^{i+1}(q) &= \bigcup\_{(q,g,up,q')\in T} \mathcal{G}^i(q) \cup \operatorname{wp}(\mathcal{G}^i(q'), up) \end{aligned}$$

The iteration stabilizes when there exists a k satisfying G<sup>k+1</sup>(q) = G<sup>k</sup>(q) for all q. At stabilization, the values G<sup>k</sup>(q) satisfy the equations of Definition 9 and give the required G(q). However, as mentioned earlier, this iteration might not stabilize at any k. We now develop some observations that help detect, after finitely many steps, whether the iteration will stabilize.

Suppose we colour the set G<sup>i+1</sup>(q) *red* if either there exists a diagonal constraint x − y ⊴ c ∈ G<sup>i+1</sup>(q) \ G<sup>i</sup>(q) (a new diagonal is added), or there exists a non-diagonal constraint x ⊴ c or c ⊴ x in G<sup>i+1</sup>(q) \ G<sup>i</sup>(q) such that the constant c is strictly bigger than c′ for every non-diagonal x ⊴ c′ or c′ ⊴ x, respectively, in G<sup>i</sup>(q) (a non-diagonal with a bigger constant is added). If this condition does not apply, we colour the set G<sup>i+1</sup>(q) *green*. The next observations say that the iteration terminates iff we reach a stage where all sets are green. Intuitively, once we reach green, the only constraints that can be added are non-diagonals having smaller (non-negative) constants, and hence the procedure terminates.

Lemma 8. *Let* i > 0*. If* G<sup>i</sup>(q) *is green for all* q*, then* G<sup>i+1</sup>(q) *is green for all* q*.*

Lemma 9. *Let* K = 1 + |Q| · |X| · (|X| + 1)*. If there is a state* p *such that* G<sup>K</sup>(p) *is red, then there is no* i *such that* G<sup>i</sup>(q) *is green for all* q*.*

As to why the bound K = 1 + |Q| · |X| · (|X| + 1) in the lemma above: a red state at stage i arises due to the addition of a constraint ϕ<sub>i</sub> at a state p<sub>i</sub>, which in turn depends on a state p<sub>i−1</sub> marked red at stage i − 1 due to a constraint ϕ<sub>i−1</sub>. If we iterate sufficiently long, we will hit a state p, a sequence of transitions from p to p, and a constraint ϕ such that computing the weakest pre-condition over this loop gives a new constraint with the same set of clocks as ϕ but with a different constant. This step can then be iterated infinitely often.

Proposition 3. *The least solution of the state-based guards equations for a UTA is finite iff* G<sup>K</sup>(q) *is green for all* q*, where* K = 1 + |Q| · |X| · (|X| + 1)*.*

Theorem 4. *Let* A *be a UTA. It is decidable whether the equations in Definition 9 have a finite solution. When these equations do have a finite solution, zone graph enumeration using* ⪯<sup>LU</sup><sub>A</sub> *is a sound, complete and terminating procedure for the reachability problem.*

All decidable classes of [12] can be shown decidable with our approach, by showing stabilization of the G(q) computation.

Lemma 10. *Reachability is decidable in UTA where: guards are non-diagonal and updates are of the form* x := c*,* x := y*,* x := y + c *with* c ≥ 0*; or guards include diagonal constraints and updates are of the form* x := c*,* x := y*.*

#### 6 Experiments

We have implemented the reachability algorithm for timed automata with diagonal constraints (and only resets as updates) based on the simulation approach (p. xx), using the ⪯<sup>LU</sup><sub>A</sub> simulation (Definition 7) for pruning zones. The algorithm for Z ⊑<sup>LU</sup><sub>G</sub> Z′ comes from Sect. 4. Experiments are reported in Table 1. We take the model *Cex* from [8,30] and *Fischer* from [30]. We are not aware of any other "standard" benchmarks containing diagonal constraints. In addition to these two models, we introduce a new benchmark. It is an extension of job-shop scheduling using (diagonal-free) timed automata [1], where the tasks within a job were logically independent. We add some timing dependency between them

Table 1. Experiments: the column #D gives the number of diagonal constraints. Four methods are reported in the table. The first two methods, TChecker with our simulation relation ⊑<sup>LU</sup><sub>G</sub> and the UPPAAL engine for diagonals, have been run on A, the automaton containing diagonal constraints. The third and fourth methods run the diagonal-free engines of UPPAAL and TChecker on A<sub>df</sub>, a diagonal-free equivalent of A. Experiments were run on macOS with a 2.3 GHz Intel Core i5 processor and 8 GB RAM. Times are reported in seconds. We set a timeout of 15 min.


which gets naturally modeled using diagonal constraints. Each model considered above is a product of k timed automata. In the table, we write the name of the model and the number k of automata involved in the product. We also report the number of diagonal constraints in each of them.

*Experimental Results.* We report the results of four methods of handling diagonal constraints, as mentioned in the caption of Table 1. For each method, we report the number of zones enumerated and the time taken. The first method gives a huge gain over the second one (up to four orders of magnitude in the number of nodes, and even more for time) and a less marked, but still significant, gain over the third and fourth methods. We provide a brief explanation of this phenomenon. The performance of the reachability algorithm depends on three factors:


The algorithm of column 1 uses the superior heuristic in all three of the optimizations above. Avoiding the splitting of zones was possible thanks to our simulation approach, which temporarily splits zones for checking Z ⊑<sup>LU</sup><sub>G</sub> Z′, but never starts a new exploration from any of the split nodes. The algorithm of column 2, which is implemented in the current version UPPAAL 4.1, uses the inferior heuristic in all three. In particular, it is not clear how the extrapolation approach can avoid the zone splitting in an efficient manner. The superiority of our approach gets amplified (by multiplicative factors) when we consider bigger products with many more diagonals. In the third and fourth methods, we give a diagonal-free equivalent of the original model (cf. Theorem 1) and use the UPPAAL and TChecker engines for diagonal-free timed automata, respectively. The UPPAAL diagonal-free engine is highly optimized and makes use of the superior heuristics in the first two optimizations mentioned above (the third heuristic is not applicable, since the automaton is diagonal-free). The third and fourth methods can be considered a good approximation of the zone splitting approach to diagonal constraints using LU-abstractions and local guards.

The second and third methods are the only possibilities for verifying timed models with diagonal constraints in UPPAAL. Both approaches are in principle prone to a 2<sup>#D</sup> blow-up compared to the first approach, where #D is the number of diagonal constraints. The table shows that a good extent of this blow-up indeed happens. The UPPAAL diagonal-free engine uses "minimal constraint systems" [6] for representing zones, whereas TChecker uses DBMs [15]. This explains why, even with the same number of nodes visited, UPPAAL performs better in terms of time. We have not included in the table a comparison with two other works dealing with the same problem: the refined diagonal-free conversion [30] and the extension of the LU simulation to diagonals [18]. However, our results are better than those reported in these papers.

#### 7 Conclusion

We have proposed a new algorithm for handling diagonal constraints in timed automata, and extended it to automata with general updates. Our approach is based on a simulation relation between zones. From our preliminary experiments, we infer that the use of simulations is indispensable in the presence of diagonal constraints, as zone splitting can be avoided. Moreover, the fact that the simulation approach stores the actual zones (as opposed to abstracted zones in the extrapolation approach) has enabled optimizations for diagonal-free automata that work with dynamically changing simulation parameters (LU-bounds), which are learnt as and when the zones are expanded [22]. Working with actual zones is also convenient for finding cost-optimal paths in priced timed automata [11]. Investigating these in the presence of diagonal constraints is part of future work. We have not yet implemented our approach for updatable timed automata; this will also be part of our future work.

Working directly with a model containing diagonal constraints could be convenient (both during modeling, and during extraction of diagnostic traces) and can also potentially give a smaller automaton to begin with. We believe that our experiments provide hope that diagonal constraints can indeed be used.

#### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Safety and Co-safety Comparator Automata for Discounted-Sum Inclusion**

Suguman Bansal(B) and Moshe Y. Vardi

Rice University, Houston, TX 77005, USA sugumanb@gmail.com

**Abstract.** *Discounted-sum inclusion* (DS-inclusion, in short) formalizes the goal of comparing quantitative dimensions of systems, such as cost, resource consumption, and the like, when the mode of aggregation for the quantitative dimension is discounted-sum aggregation. *Discounted-sum comparator automata*, or *DS-comparators* in short, are Büchi automata that read two infinite sequences of weights synchronously and relate their discounted-sums. Recent empirical investigations have shown that while DS-comparators enable competitive algorithms for DS-inclusion, they still suffer from the scalability bottleneck of Büchi operations.

Motivated by the connections between discounted-sum and Büchi automata, this paper undertakes an investigation of language-theoretic properties of DS-comparators in order to mitigate the challenges of Büchi DS-comparators and achieve improved scalability of DS-inclusion. Our investigation uncovers that DS-comparators possess safety and co-safety language-theoretic properties. As a result, they enable reductions based on subset-construction methods as opposed to higher-complexity Büchi complementation, yielding tighter worst-case complexity and improved empirical scalability for DS-inclusion.

#### **1 Introduction**

The analysis of quantitative dimensions of computing systems, such as cost, resource consumption, and distance metrics [6,10,28], has been studied thoroughly in order to design efficient computing systems. Cost-aware program synthesis [14,16] and low-cost program repair [25] have found compelling applications in robotics [24,29], education [22], and the like. *Quantitative verification* facilitates efficient system design by automatically determining whether a system implementation is more efficient than a specification model. Investigations in quantitative verification have demonstrated its high computational complexity and practical intractability [17,23]. This work addresses the practical intractability of quantitative verification.

At the core of quantitative verification lies the problem of *quantitative inclusion* which formalizes the goal of determining which of two given systems is more efficient [17,23,31]. In quantitative inclusion, quantitative systems are abstracted as weighted automata [7,21,32]. A run in a weighted automaton is associated with a sequence of weights. The quantitative dimension of these runs is determined by the weight of runs, which is computed by taking an aggregate of the run's weight sequence. Quantitative inclusion can be thought of as the quantitative generalization of (qualitative) language inclusion.

A commonly appearing mode of aggregation is *discounted-sum (DS) aggregation*, which captures the intuition that weights incurred in the near future are more significant than those incurred later on [19]. The convergence of DS aggregation for all bounded infinite weight-sequences makes it a preferred mode of aggregation across domains: reinforcement learning [37], planning under uncertainty [34], and game theory [33]. This work examines the problem of *discounted-sum inclusion*, or *DS-inclusion*, that is, quantitative inclusion when *discounted sum* is the mode of aggregation.

In theory, DS-inclusion is PSPACE-complete [12]. Recent algorithmic approaches have tapped into language-theoretic properties of the discounted-sum aggregate function [12,18] to design practical algorithms for DS-inclusion [11,12]. These algorithms use *DS-comparator automata* (*DS-comparators*, in short) as their main technique, and are *purely* automata-theoretic. While these algorithms outperform other existing approaches for DS-inclusion in runtime [15,17], even they do not scale well on weighted automata with more than a few hundred states [11]. This work contributes novel techniques and algorithms for DS-inclusion that address this scalability challenge.

An in-depth examination of the DS-comparator based algorithms exposes their scalability bottleneck. A DS-comparator is a Büchi automaton that relates the discounted-sum aggregates of two (bounded) weight-sequences A and B by determining the membership of the interleaved pair of sequences (A, B) in the language of the comparator. As a result, DS-comparators reduce DS-inclusion to language inclusion between (non-deterministic) Büchi automata. Although many techniques have been proposed to solve Büchi language inclusion efficiently in practice [4,20], none of them can avoid an exponential blow-up of 2^{O(n log n)}, for an n-sized input, caused by a direct or indirect involvement of Büchi complementation [36,40].

This work meets the scalability challenge of DS-inclusion by delving deeper into language-theoretic properties of discounted-sum aggregate functions [18] in order to obtain algorithms for DS-inclusion that yield both tighter theoretical complexity and improved scalability. Specifically, we prove that DS-comparators can be expressed as *safety automata* or *co-safety automata* [26] (Sect. 3.1), and have compact deterministic constructions (Sect. 3.2). Safety and co-safety automata have the property that their complementation is performed by simpler, lower-complexity (2^{O(n)}) subset-construction methods [27]. As a result, they facilitate a procedure for DS-inclusion that uses subset-construction based intermediate steps instead of Büchi complementation, yielding an improvement in theoretical complexity from 2^{O(n·log n)} to 2^{O(n)}. Our subset-construction based procedure has yet another advantage over Büchi complementation: it supports efficient on-the-fly implementations, yielding practical scalability as well (Sect. 4).

An empirical evaluation of our prototype tool QuIPFly for the proposed procedure against the prior DS-comparator algorithm and other existing approaches for DS-inclusion shows that QuIPFly outperforms them by orders of magnitude both in runtime and the number of benchmarks solved (Sect. 4).

#### **2 Preliminaries and Related Work**

A weight-sequence, finite or infinite, is *bounded* if the absolute value of each of its elements is bounded by a fixed number.

*Büchi Automaton:* A *Büchi automaton* is a tuple A = (S, Σ, δ, s_I, F), where S is a finite set of *states*, Σ is a finite *input alphabet*, δ ⊆ (S × Σ × S) is the *transition relation*, state s_I ∈ S is the *initial state*, and F ⊆ S is the set of *accepting states* [39]. A Büchi automaton is *deterministic* if for all states s and inputs a, |{s' | (s, a, s') ∈ δ}| ≤ 1; otherwise, it is *non-deterministic*. A Büchi automaton is *complete* if for all states s and inputs a, |{s' | (s, a, s') ∈ δ}| ≥ 1. For a word w = w_0 w_1 ··· ∈ Σ^ω, a *run* ρ of w is a sequence of states s_0 s_1 ... s.t. s_0 = s_I and τ_i = (s_i, w_i, s_{i+1}) ∈ δ for all i. Let *inf*(ρ) denote the set of states that occur infinitely often in run ρ. A run ρ is an *accepting run* if *inf*(ρ) ∩ F ≠ ∅. A word w is an *accepting word* if it has an accepting run. The language of Büchi automaton A, denoted by L(A), is the set of all words accepted by A. By abuse of notation, we write w ∈ A and ρ ∈ A if w and ρ are an accepting word and an accepting run of A, respectively. Büchi automata are closed under set-theoretic union, intersection, and complementation [39].

*Safety and Co-safety Properties:* Let L ⊆ Σ^ω be a language over alphabet Σ. A finite word w ∈ Σ* is a *bad prefix* for L if for all infinite words y ∈ Σ^ω, w · y ∉ L. A language L is a *safety language* if every word w ∉ L has a bad prefix for L. A language L is a *co-safety language* if its complement language is a safety language [5]. When a safety or co-safety language is an ω-regular language, the Büchi automaton representing it is called a safety or co-safety automaton, respectively [26]. W.l.o.g., safety and co-safety automata contain a *sink state* with an outgoing transition on every alphabet symbol, each of which loops back to the sink state. All states except the sink state are accepting in a safety automaton, while only the sink state is accepting in a co-safety automaton. Unlike Büchi complementation, complementation of safety and co-safety automata is conducted by a simpler subset construction with a lower 2^{O(n)} blow-up. The complement of a safety automaton is a co-safety automaton, and vice versa. Safety automata are closed under intersection, and co-safety automata are closed under union.
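
To make the notion of a bad prefix concrete, here is a minimal illustrative sketch (not from the paper): a deterministic safety automaton for the language over {a, b} of words with no two consecutive b's. The automaton, and the names `step` and `is_bad_prefix`, are our own illustration.

```python
# Illustrative safety automaton for "no two consecutive b's" over {a, b}.
# Every rejected word has a bad prefix -- ending at the first "bb" -- after
# which no infinite extension can be accepted; the rejecting sink models this.

SINK = "sink"

def step(state, symbol):
    """Deterministic transition: non-sink states count trailing b's (0 or 1)."""
    if state == SINK:
        return SINK                       # the sink loops on every symbol
    if symbol == "b":
        return SINK if state == 1 else 1  # a second consecutive b is fatal
    return 0

def is_bad_prefix(word):
    """A finite word is a bad prefix iff it drives the automaton to the sink."""
    state = 0
    for symbol in word:
        state = step(state, symbol)
    return state == SINK
```

Note that every finite extension of a bad prefix is again a bad prefix, since the sink loops on every symbol.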

*Comparator Automaton:* For a finite set of integers Σ, an aggregate function f : Z^ω → R, and an equality or inequality relation R ∈ {<, >, ≤, ≥, =, ≠}, the *comparison language for* f *with relation* R is the language of infinite words over the alphabet Σ × Σ that contains a pair (A, B) iff f(A) R f(B) holds. A *comparator automaton (comparator, in short) for aggregate function* f *and relation* R is an automaton that accepts the comparison language for f with R [12]. A comparator is said to be *regular* if its automaton is a Büchi automaton.

*Weighted Automaton:* A *weighted automaton* over infinite words is a tuple A = (M, γ, f), where M = (S, Σ, δ, s_I, S) is a complete Büchi automaton with all states accepting, γ : δ → N is a *weight function*, and f : N^ω → R is the *aggregate function* [17,31]. *Words* and *runs* in weighted automata are defined as in Büchi automata. The *weight-sequence* of run ρ = s_0 s_1 ... of word w = w_0 w_1 ... is given by wt_ρ = n_0 n_1 n_2 ... where n_i = γ(s_i, w_i, s_{i+1}) for all i. The *weight of a run* ρ, denoted by f(ρ), is given by f(wt_ρ). The *weight of a word* w ∈ Σ^ω in a weighted automaton is defined as wt_A(w) = sup{f(ρ) | ρ is a run of w in A}.
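
The definition of wt_A(w) as a supremum over runs can be illustrated on a toy non-deterministic weighted automaton. The automaton, its state names, and the discount-factor below are all hypothetical, and we approximate infinite words by finite prefixes, so the supremum becomes a maximum over finitely many finite runs.

```python
from fractions import Fraction

# Hypothetical 2-state weighted automaton over alphabet {a}: from q0, reading
# 'a' either stays in q0 with weight 1 or moves to q1 with weight 2; q1 loops
# on 'a' with weight 0.
TRANSITIONS = {
    ("q0", "a"): [("q0", 1), ("q1", 2)],
    ("q1", "a"): [("q1", 0)],
}

def runs(state, word):
    """Enumerate the weight-sequences of all runs of a finite word."""
    if not word:
        yield []
        return
    for nxt, weight in TRANSITIONS.get((state, word[0]), []):
        for rest in runs(nxt, word[1:]):
            yield [weight] + rest

def ds(weights, d):
    """Discounted sum of a finite weight-sequence."""
    return sum(Fraction(w, d**i) for i, w in enumerate(weights))

def word_weight(word, d=2):
    """wt_A(w) = sup over runs; here a max over finitely many finite runs."""
    return max(ds(wt, d) for wt in runs("q0", word))
```

For instance, on the word "aaa" with d = 2 the runs have weight-sequences [1,1,1], [1,1,2], [1,2,0], and [2,0,0]; the maximum discounted sum is 2.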

*Quantitative Inclusion:* Let P and Q be weighted automata with the *same* aggregate function. The *strict quantitative inclusion problem*, denoted by P ⊂ Q, asks whether for all words w ∈ Σ^ω, wt_P(w) < wt_Q(w). The *non-strict quantitative inclusion problem*, denoted by P ⊆ Q, asks whether for all words w ∈ Σ^ω, wt_P(w) ≤ wt_Q(w). The *comparison language or comparator of a quantitative inclusion* problem refers to the comparison language or comparator of the associated aggregate function.

*Discounted-sum Inclusion:* Let A = A_0, A_1, ... be a weight sequence and d > 1 be a rational number. The *discounted-sum* (DS, in short) of A with discount-factor d is DS(A, d) = Σ_{i=0}^{∞} A_i/d^i. DS-comparison language and DS-comparator with discount-factor d > 1 are the comparison language and comparator obtained for the discounted-sum aggregate function with discount-factor d > 1, respectively. Strict or non-strict discounted-sum inclusion is strict or non-strict quantitative inclusion with the discounted-sum aggregate function, respectively. For brevity, we abbreviate discounted-sum inclusion to DS-inclusion.
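
As a quick illustrative computation (our own sketch, not the paper's), the discounted sum of a finite prefix approximates the infinite sum, with the error bounded by the geometric tail of the series:

```python
from fractions import Fraction

def ds_prefix(weights, d):
    """Discounted sum of a finite weight-sequence: DS(A, d) = sum_i A_i / d^i."""
    return sum(Fraction(a, d**i) for i, a in enumerate(weights))

# For the constant sequence 1, 1, 1, ... with d = 2, DS = 1/(1 - 1/2) = 2.
# An n-term prefix underestimates the limit by at most the geometric tail
# sum_{i >= n} mu / d^i = (mu / d^n) * d / (d - 1).
d, mu, n = 2, 1, 30
approx = ds_prefix([1] * n, d)
tail = Fraction(mu, d**n) * Fraction(d, d - 1)   # bound on the omitted tail
assert abs(Fraction(2) - approx) <= tail
```

This tail bound is exactly the convergence property that makes DS aggregation well defined on all bounded infinite weight-sequences.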

**Related Work.** The decidability of DS-inclusion is an open problem when the discount-factor d > 1 is arbitrary. Recent work has established that DS-inclusion is PSPACE-complete when the discount-factor is an integer [12]. This work investigates algorithmic approaches to DS-inclusion with integer discount-factors.

Two contrasting solution approaches have been identified for DS-inclusion. The first approach is *hybrid* [17]. It separates the language-theoretic aspects of weighted automata from the numerical aspects, and solves each separately [15,17]. More specifically, the hybrid approach solves the language-theoretic aspects by DS-determinization [15], followed by linear programming for the numerical aspects [8,9]. To the best of our knowledge, these two steps cannot be performed in parallel. As a result, this approach must always incur the exponential cost of DS-determinization.

The second approach is *purely* automata-theoretic [12]. This approach uses a regular DS-comparator to reduce DS-inclusion to language inclusion between non-deterministic Büchi automata [11,12]. While the purely automata-theoretic approach scales better than the hybrid approach in runtime [11], its scalability suffers from fundamental algorithmic limitations of Büchi language inclusion. A key ingredient of Büchi language inclusion is Büchi complementation [36]. Büchi complementation is 2^{O(n log n)} in the worst case, and is practically intractable [40]. These limitations also feature in the theoretical complexity and practical performance of DS-inclusion: the complexity of DS-inclusion between weighted automata P and Q with regular DS-comparator C for integer discount-factor d > 1 is |P| · 2^{O(|P||Q||C| · log(|P||Q||C|))}.

This work improves the worst-case complexity and practical performance of the purely automata-theoretic approach for DS-inclusion by a closer investigation of language-theoretic properties of DS-comparators. In particular, we identify that DS-comparators for integer discount-factors form safety or co-safety automata (depending on the relation R). We show that the complementation advantage of safety/co-safety automata not only improves the theoretical complexity of DS-inclusion with integer discount-factors but also facilitates on-the-fly implementations that significantly improve practical performance.

#### **3 DS-inclusion with Integer Discount-Factor**

This section covers the core technical contributions of this paper. We uncover novel language-theoretic properties of DS-comparison languages and utilize them to obtain a tighter theoretical upper bound for DS-inclusion with integer discount-factors. Unless mentioned otherwise, the discount-factor is an integer.

In Sect. 3.1 we prove that DS-comparison languages are either safety or co-safety for all rational discount-factors. Since DS-comparison languages are ω-regular for integer discount-factors [12], we obtain that DS-comparators for integer discount-factors form safety or co-safety automata. Next, Sect. 3.2 makes use of the newly obtained safety/co-safety properties of DS-comparators to present the first deterministic constructions for DS-comparators. These deterministic constructions are compact in the sense that they match their non-deterministic counterparts in number of states [11]. Section 3.3 evaluates the complexity of quantitative inclusion with regular safety/co-safety comparators, and observes that it is lower than the complexity of quantitative inclusion with regular comparators. Finally, since DS-comparators are regular safety/co-safety comparators, our analysis shows that the complexity of DS-inclusion improves as a consequence.

We begin with formal definitions of safety/co-safety comparison languages and safety/co-safety comparators:

**Definition 1 (Safety and co-safety comparison languages).** *Let* Σ *be a finite set of integers,* f : Z^ω → R *be an aggregate function, and* R ∈ {≤, <, ≥, >, =, ≠} *be a relation. A comparison language* L *over* Σ × Σ *for aggregate function* f *and relation* R *is said to be a safety comparison language (or a co-safety comparison language) if* L *is a safety language (or a co-safety language).*

**Definition 2 (Safety and co-safety comparators).** *Let* Σ *be a finite set of integers,* f : Z^ω → R *be an aggregate function, and* R ∈ {≤, <, ≥, >, =, ≠} *be a relation. A comparator for aggregate function* f *and relation* R *is a* safety comparator *(or a* co-safety comparator*) if the comparison language for* f *and* R *is a safety language (or a co-safety language).*

A safety comparator is *regular* if its language is ω-regular (equivalently, if its automaton is a safety automaton). Likewise, a co-safety comparator is *regular* if its language is ω-regular (equivalently, if its automaton is a co-safety automaton).

By the complementation duality of safety and co-safety languages, the comparison language for an aggregate function f with non-strict inequality ≤ is safety iff the comparison language for f with strict inequality < is co-safety. Since safety languages and safety automata are closed under intersection, safety comparison languages and regular safety comparators for the non-strict inequalities yield the same for equality. Similarly, since co-safety languages and co-safety automata are closed under union, co-safety comparison languages and regular co-safety comparators for the strict inequalities yield the same for the inequality relation ≠. Therefore, it suffices to examine the comparison language for one relation only.

It is worth noting that for weight-sequences A and B and all relations R, we have that DS(A, d) R DS(B, d) iff DS(A − B, d) R 0, where (A − B)_i = A_i − B_i for all i ≥ 0. Prior work [11] shows that we can therefore define the *DS-comparison language* with upper bound μ, discount-factor d > 1, and relation R to accept an infinite and bounded weight-sequence C over {−μ, ..., μ} iff DS(C, d) R 0 holds. Similarly, the DS-comparator with the same parameters μ, d > 1, and R accepts the DS-comparison language with parameters μ, d, and R. We adopt these definitions for DS-comparison languages and DS-comparators.
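
A small sanity check of the reduction DS(A, d) R DS(B, d) iff DS(A − B, d) R 0, on hypothetical finite prefixes with exact rational arithmetic (the equivalence holds because the discounted sum is linear in the weight-sequence):

```python
from fractions import Fraction

def ds(weights, d):
    """Discounted sum of a finite weight-sequence."""
    return sum(Fraction(w, d**i) for i, w in enumerate(weights))

A = [1, 2, 0, 1]   # illustrative weight-sequences with bound mu = 2
B = [2, 0, 1, 1]
d = 2

# (A - B)_i = A_i - B_i; the difference sequence is bounded by 2 * mu.
diff = [a - b for a, b in zip(A, B)]

assert (ds(A, d) <= ds(B, d)) == (ds(diff, d) <= 0)
assert (ds(A, d) < ds(B, d)) == (ds(diff, d) < 0)
```

Here DS(A, 2) = 17/8, DS(B, 2) = 19/8, and DS(A − B, 2) = −1/4, so both sides of each equivalence hold.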

Throughout this section, the concatenation of a finite sequence x with a finite or infinite sequence y is denoted by x · y.

#### **3.1 DS-comparison Languages and Their Safety/Co-safety Properties**

The central result of this section is that DS-comparison languages are safety or co-safety languages for all (integer and non-integer) discount-factors (Theorem 1). In particular, since DS-comparison languages are ω-regular for integer discount-factors [12], this implies that DS-comparators for integer discountfactors form safety or co-safety automata (Corollary 1).

The argument for safety/co-safety of DS-comparison languages depends on the property that the discounted-sum aggregate of all bounded weight-sequences exists for all discount-factors d > 1 [35].

#### **Theorem 1.** *Let* μ > 1 *be the upper bound. For rational discount-factor* d > 1*, the DS-comparison languages with upper bound* μ*, discount-factor* d*, and relations* ≤, ≥, = *are safety languages, and those with relations* <, >, ≠ *are co-safety languages.*


*Proof (Proof sketch).* Due to duality of safety/co-safety languages, it suffices to show that DS-comparison language with ≤ is a safety language.

Let the DS-comparison language with upper bound μ, rational discount-factor d > 1, and relation ≤ be denoted by L^{μ,d}_≤. Suppose that L^{μ,d}_≤ is not a safety language. Then there is a weight-sequence W in the complement of L^{μ,d}_≤ such that W does not have a bad prefix. The following then hold: (a) DS(W, d) > 0; (b) for all i ≥ 0, the i-length prefix W[i] of W can be extended to an infinite and bounded weight-sequence W[i] · Y^i such that DS(W[i] · Y^i, d) ≤ 0.

Note that DS(W, d) = DS(W[i], d) + (1/d^i) · DS(W[i...], d), where W[i...] = W_i W_{i+1} ... and DS(W[i], d) is the discounted-sum of the finite sequence W[i], i.e., DS(W[i], d) = Σ_{j=0}^{i−1} W[j]/d^j. Similarly, DS(W[i] · Y^i, d) = DS(W[i], d) + (1/d^i) · DS(Y^i, d). The contribution of the tail sequences W[i...] and Y^i to the discounted-sum of W and W[i] · Y^i, respectively, diminishes exponentially as i increases. In addition, since W and W[i] · Y^i share the common i-length prefix W[i], their discounted-sum values must converge to each other. Since the discounted-sum of W is fixed and greater than 0, by convergence there must be a k ≥ 0 such that DS(W[k] · Y^k, d) > 0, contradicting (b).

Therefore, DS-comparison language with ≤ is a safety language. 

Semantically, this result implies that for a bounded weight-sequence C and rational discount-factor d > 1, if DS(C, d) > 0 then C must have a finite prefix C_pre such that the discounted-sum of C_pre is so large that no infinite extension by a bounded weight-sequence Y can reduce the discounted-sum of C_pre · Y with the same discount-factor d to zero or below.

Prior work shows that DS-comparison languages are expressed by Büchi automata iff the discount-factor is an integer [13]. Therefore:

**Corollary 1.** *Let* μ > 1 *be the upper bound. For* integer *discount-factor* d > 1*, the DS-comparators with relations* ≤, ≥, = *are regular safety comparators, and those with relations* <, >, ≠ *are regular co-safety comparators.*


Lastly, it is worth mentioning that for the same reason [13] DS-comparators for non-integer rational discount-factors do not form safety or co-safety automata.

#### **3.2 Deterministic DS-comparator for Integer Discount-Factor**

This section presents deterministic safety/co-safety constructions for DS-comparators with integer discount-factors. This differs from prior works, which supply non-deterministic Büchi constructions only [11,12]. An outcome of DS-comparators being regular safety/co-safety (Corollary 1) is a proof that DS-comparators permit deterministic Büchi constructions, since non-deterministic and deterministic safety automata (and co-safety automata) are equally expressive [26]. Therefore, one way to obtain a deterministic Büchi construction for DS-comparators is to determinize the non-deterministic constructions using standard procedures [26,36]. However, this results in exponentially larger deterministic constructions. Instead, this section offers direct deterministic safety/co-safety automata constructions for DS-comparators that not only avoid an exponential blow-up but also match their non-deterministic counterparts in number of states (Theorem 3).

*Key ideas.* Due to duality and closure properties of safety/co-safety automata, we only present the construction of the deterministic safety automaton for the DS-comparator with upper bound μ, integer discount-factor d > 1, and relation ≤, denoted by A^{μ,d}_≤. We proceed by obtaining a *deterministic finite automaton* (DFA), denoted by bad(μ, d, ≤), for the language of bad prefixes of A^{μ,d}_≤ (Theorem 2). Trivial modifications to bad(μ, d, ≤) then furnish the desired deterministic safety automaton for A^{μ,d}_≤ (Theorem 3).

*Construction.* We begin with some definitions. Let W be a *finite* weight-sequence. By abuse of notation, the discounted-sum of a finite sequence W with discount-factor d is defined as DS(W, d) = DS(W · 0^ω, d). The *recoverable-gap* of a finite weight-sequence W with discount-factor d, denoted gap(W, d), is its normalized discounted-sum: if W = ε (the empty sequence), gap(ε, d) = 0, and gap(W, d) = d^{|W|−1} · DS(W, d) otherwise [15]. Observe that the recoverable-gap has an inductive definition: gap(ε, d) = 0, and gap(W · v, d) = d · gap(W, d) + v, where v ∈ {−μ, ..., μ}.
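
The two equivalent definitions of the recoverable-gap can be cross-checked with a short sketch (our own, using exact rationals; for integer d and integer weights the gap is always an integer):

```python
from fractions import Fraction

def ds_finite(W, d):
    """Discounted-sum of finite W, i.e. DS(W . 0^omega, d)."""
    return sum(Fraction(w, d**i) for i, w in enumerate(W))

def gap_direct(W, d):
    """Closed form: gap(eps, d) = 0; gap(W, d) = d^(|W|-1) * DS(W, d)."""
    return Fraction(0) if not W else d ** (len(W) - 1) * ds_finite(W, d)

def gap_inductive(W, d):
    """Inductive form: gap(eps, d) = 0; gap(W . v, d) = d * gap(W, d) + v."""
    g = Fraction(0)
    for v in W:
        g = d * g + v
    return g

# The two definitions agree, e.g. for W = 2, -1, 0, 1 and d = 2 both give 13.
assert gap_direct([2, -1, 0, 1], 2) == gap_inductive([2, -1, 0, 1], 2) == 13
```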

This observation influences a sketch for bad(μ, d, ≤). Suppose all possible recoverable-gap values of weight-sequences form the set of states. Then the transition relation of the DFA can mimic the inductive definition of the recoverable-gap, i.e., there is a transition from state s to state t on alphabet symbol v ∈ {−μ, ..., μ} iff t = d · s + v, where s and t are recoverable-gap values of weight-sequences. There is one caveat here: there are infinitely many possible values of the recoverable-gap. We need to limit the recoverable-gap values to finitely many values of interest. The core aspect of this construction is to identify these values.

First, we obtain a lower bound on the recoverable-gap of bad prefixes of A^{μ,d}_≤:

**Lemma 1.** *Let* μ *and* d > 1 *be the bound and discount-factor, resp. Let* T = μ/(d−1) *be the threshold value. Let* W *be a non-empty, bounded, finite weight-sequence. Weight-sequence* W *is a bad prefix of* A^{μ,d}_≤ *iff* gap(W, d) > T*.*

*Proof.* Let finite weight-sequence W be a bad prefix of A^{μ,d}_≤. Then DS(W · Y, d) > 0 for all infinite and bounded weight-sequences Y. Since DS(W · Y, d) = DS(W, d) + (1/d^{|W|}) · DS(Y, d), we get inf_Y (DS(W, d) + (1/d^{|W|}) · DS(Y, d)) > 0, and hence DS(W, d) + (1/d^{|W|}) · inf_Y (DS(Y, d)) > 0, as W is a fixed sequence. Since inf_Y (DS(Y, d)) = −T · d, this gives DS(W, d) − T/d^{|W|−1} > 0, i.e., gap(W, d) − T > 0. Conversely, for all infinite, bounded weight-sequences Y, DS(W · Y, d) · d^{|W|−1} = gap(W, d) + (1/d) · DS(Y, d). Since gap(W, d) > T and inf_Y (DS(Y, d)) = −T · d, we get DS(W · Y, d) > 0.

Since all finite and bounded extensions of bad prefixes are also bad prefixes, Lemma 1 implies that if the recoverable-gap of a finite sequence strictly exceeds the threshold T, then the recoverable-gap of all of its extensions also exceeds T. Since a recoverable-gap exceeding the threshold T is the precise condition for bad prefixes, all states with recoverable-gap exceeding T can be merged into a single state. Note that this state forms an accepting sink in bad(μ, d, ≤).

Next, we attempt to merge very low recoverable-gap values into a single state. For this purpose, we define *very-good prefixes* for A^{μ,d}_≤: A finite and bounded weight-sequence W is a *very-good prefix* for the language of A^{μ,d}_≤ if for all infinite, bounded extensions of W by Y, DS(W · Y, d) ≤ 0. A proof similar to that of Lemma 1 yields an upper bound on the recoverable-gap of very-good prefixes of A^{μ,d}_≤:

**Lemma 2.** *Let* μ *and* d > 1 *be the bound and discount-factor, resp. Let* T = μ/(d−1) *be the threshold value. Let* W *be a non-empty, bounded, finite weight-sequence. Weight-sequence* W *is a very-good prefix of* A^{μ,d}_≤ *iff* gap(W, d) ≤ −T*.*

Clearly, finite extensions of very-good prefixes are also very-good prefixes. Further, bad(μ, d, ≤) must not accept very-good prefixes. Thus, by reasoning as earlier, all recoverable-gap values that are less than or equal to −T can be merged into one non-accepting sink state in bad(μ, d, ≤).

Finally, for an integer discount-factor the recoverable-gap is an integer. Let ⌊x⌋ denote the floor of x ∈ R, e.g., ⌊2.3⌋ = 2, ⌊−2⌋ = −2, ⌊−2.3⌋ = −3. Then:

**Corollary 2.** *Let* μ *be the bound and* d > 1 *an integer discount-factor. Let* T = μ/(d−1) *be the threshold. Let* W *be a non-empty, bounded, finite weight-sequence. Then* W *is a bad prefix of* A^{μ,d}_≤ *iff* gap(W, d) > ⌊T⌋*, and* W *is a very-good prefix of* A^{μ,d}_≤ *iff* gap(W, d) ≤ −T*.*


So, the recoverable-gap value is either one of {−⌊T⌋ + 1, ..., ⌊T⌋}, less than or equal to −T, or greater than ⌊T⌋. Since T = μ/(d−1) and 1 < d/(d−1) ≤ 2 for integer d ≥ 2, we have T ≤ μ, which curbs the state-space to O(μ)-many values of interest. Lastly, since gap(ε, d) = 0, state 0 must be the initial state.

*Construction of* bad(μ, d, ≤). Let μ be the upper bound, and let d > 1 be the integer discount-factor. Let T = μ/(d−1) be the threshold value. The finite-state automaton bad(μ, d, ≤) = (S, s_I, Σ, δ, F) has states S = {−⌊T⌋ + 1, ..., ⌊T⌋} ∪ {bad, veryGood}, initial state s_I = 0, alphabet Σ = {−μ, ..., μ}, and accepting states F = {bad}. Its deterministic transition function δ takes state s on symbol a to state t as follows:

	- 1. If s ∈ {bad, veryGood}, then t = s for all a ∈ Σ.

	- 2. If s ∈ {−⌊T⌋ + 1, ..., ⌊T⌋} and a ∈ Σ, then t = d · s + a if −T < d · s + a ≤ ⌊T⌋, t = bad if d · s + a > ⌊T⌋, and t = veryGood if d · s + a ≤ −T.


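A compact executable sketch of this DFA (our own, not the paper's artifact): states are integer recoverable-gap values, with the two sinks bad and veryGood; the classification can be cross-checked against the gap-based conditions of Lemmas 1 and 2, e.g. for μ = 2, d = 2 (so T = 2).

```python
from fractions import Fraction

def classify(W, mu, d):
    """Run the bad-prefix DFA sketched above on a finite weight-sequence W.
    Non-sink states are integer recoverable-gap values; 'bad' and 'veryGood'
    are the sinks for gap > T and gap <= -T, respectively."""
    T = Fraction(mu, d - 1)            # threshold T = mu / (d - 1)
    state = 0                          # gap(eps, d) = 0 is the initial state
    for v in W:
        if state in ("bad", "veryGood"):
            break                      # both sinks loop on every symbol
        state = d * state + v          # gap(W . v, d) = d * gap(W, d) + v
        if state > T:
            state = "bad"
        elif state <= -T:
            state = "veryGood"
    return state

def gap(W, d):
    """Closed form: gap(W, d) = d^(|W|-1) * DS(W, d) for non-empty W."""
    if not W:
        return Fraction(0)
    return d ** (len(W) - 1) * sum(Fraction(v, d**i) for i, v in enumerate(W))
```

For μ = 2, d = 2 the sequence 2, 2 has gap 6 > T and is classified bad, while −2 has gap −2 ≤ −T and is classified veryGood; both sinks absorb all further input.
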
**Theorem 2.** *Let* μ *be the upper bound and* d > 1 *be the integer discount-factor. Then* bad(μ, d, ≤) *accepts a finite, bounded weight-sequence iff it is a bad prefix of* A^{μ,d}_≤*.*

*Proof (Proof sketch).* First note that the transition relation is deterministic and complete; therefore, every word has a unique run in bad(μ, d, ≤). Let last be the last state in the run of a finite, bounded weight-sequence W in the DFA. Induction on the length of W shows that last = gap(W, d) if −T < gap(W, d) ≤ ⌊T⌋, last = bad iff gap(W, d) > ⌊T⌋, and last = veryGood iff gap(W, d) ≤ −T.

Therefore, a finite, bounded weight-sequence is accepted iff its recoverable-gap is greater than T; in other words, iff it is a bad prefix of A^{μ,d}_≤.

A^{μ,d}_≤ is obtained from bad(μ, d, ≤) by applying the co-Büchi acceptance condition.

**Theorem 3.** *Let* μ *be the upper bound, and* d > 1 *be the integer discount-factor. DS-comparators for all inequality and equality relations are either deterministic safety or deterministic co-safety automata with* O(μ) *states.*

As a matter of fact, the most compact non-deterministic DS-comparator constructions with parameters μ, d, and R also contain O(μ) states [11].

#### **3.3 Quantitative Inclusion with Safety/Co-safety Comparators**

This section investigates quantitative language inclusion with regular safety/co-safety comparators. Unlike quantitative inclusion with regular comparators, quantitative inclusion with regular safety/co-safety comparators is able to circumvent Büchi complementation with intermediate subset-construction steps. As a result, the complexity of quantitative inclusion with regular safety/co-safety comparators is lower than with regular comparators [12] (Theorem 4). Finally, since DS-comparators are regular safety/co-safety comparators, the algorithm for quantitative inclusion with regular safety/co-safety comparators applies to DS-inclusion, yielding a lower-complexity algorithm for DS-inclusion (Corollary 5).

*Key Ideas.* A run of word w in a weighted automaton is *maximal* if its weight is the supremum weight of all runs of w in the weighted automaton. A run ρ_P of w in P is a *counterexample* for P ⊆ Q (or P ⊂ Q) iff there exists a maximal run sup_Q of w in Q such that wt(ρ_P) > wt(sup_Q) (or wt(ρ_P) ≥ wt(sup_Q)). Consequently, P ⊆ Q (or P ⊂ Q) iff there are no counterexample runs in P. Therefore, the roadmap to solve quantitative inclusion with regular safety/co-safety comparators is as follows:


1. Construct the *maximal automaton* Maximal(Q) of Q, which accepts exactly the annotated words corresponding to maximal runs of Q.
2. From P, Maximal(Q), and the safety/co-safety comparator, construct an automaton whose language consists of the counterexample runs of P.
3. Solve quantitative inclusion for safety/co-safety comparators by checking emptiness of the counterexample automaton (Theorem 4). Finally, since DS-comparators are regular safety/co-safety automata (Corollary 1), apply Theorem 4 to obtain an algorithm for DS-inclusion that uses regular safety/co-safety comparators (Corollary 5).

Let W be a weighted automaton. The *annotated automaton* of W, denoted by Ŵ, is the Büchi automaton obtained by transforming every transition s --a--> t with weight v in W into the transition s --(a,v)--> t in Ŵ. Observe that Ŵ is a safety automaton, since all its states are accepting. A run on word w with weight-sequence wt in W corresponds to an *annotated word* (w, wt) in Ŵ, and vice versa.
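
The annotation step is mechanical, as this small illustrative sketch shows (the transition tuples are hypothetical):

```python
# Annotating a weighted automaton: each weighted transition s --a/v--> t
# becomes a Buchi transition s --(a,v)--> t over the alphabet Sigma x weights;
# all states remain accepting, so the result is a safety automaton.
weighted_delta = [
    ("s", "a", 1, "t"),   # (source, letter, weight, target)
    ("s", "a", 2, "s"),
    ("t", "b", 0, "t"),
]

annotated_delta = [(src, (a, v), tgt) for (src, a, v, tgt) in weighted_delta]
annotated_alphabet = {(a, v) for (_, a, v, _) in weighted_delta}

assert ("s", ("a", 1), "t") in annotated_delta
```

A run of "aab" through the weighted automaton with weights 1, 2, 0 corresponds exactly to the annotated word (a,1)(a,2)(b,0) in the annotated automaton.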

**Maximal Automaton.** This section covers the construction of the *maximal automaton* of a weighted automaton. Let W and Ŵ be a weighted automaton and its annotated automaton, respectively. We call an annotated word (w, wt_1) in Ŵ *maximal* if for all other words of the form (w, wt_2) in Ŵ, wt(wt_1) ≥ wt(wt_2). Clearly, (w, wt_1) is a maximal word in Ŵ iff word w has a run with weight-sequence wt_1 in W that is maximal. We define the *maximal automaton* of weighted automaton W, denoted Maximal(W), to be the automaton that accepts all maximal words of its annotated automaton Ŵ.

We show that when the comparator is regular safety/co-safety, the construction of the maximal automaton incurs a 2^{O(n)} blow-up. This section presents the construction of the maximal automaton when the comparator for non-strict inequality is regular safety. The other case, when the comparator for strict inequality is regular co-safety, is deferred to the appendix.

**Lemma 3.** *Let* W *be a weighted automaton with regular safety comparator for non-strict inequality. Then the language of* Maximal(W) *is a safety language.*

*Proof (Proof sketch).* An annotated word (w, wt_1) is not maximal in Ŵ for one of two reasons: either (w, wt_1) is not a word in Ŵ, or there exists another word (w, wt_2) in Ŵ s.t. wt(wt_1) < wt(wt_2) (equivalently, (wt_1, wt_2) is not in the comparator for non-strict inequality). Both Ŵ and the comparator for non-strict inequality are safety languages, so the language of maximal words must also be a safety language.

We now proceed to construct the safety automaton for Maximal(W).

*Intuition.* The intuition behind the construction of the maximal automaton follows directly from the definition of maximal words. Let Ŵ be the annotated automaton for weighted automaton W, and let Σ̂ denote the alphabet of Ŵ. Then an annotated word (w, wt_1) ∈ Σ̂^ω is a word in Maximal(W) if (a) (w, wt_1) ∈ Ŵ, and (b) for all words (w, wt_2) ∈ Ŵ, wt(wt_1) ≥ wt(wt_2).

The challenge here is to construct an automaton for condition (b). Intuitively, this automaton simulates the following action: as it reads word (w, wt1), it must spawn all words of the form (w, wt2) in Ŵ, while also ensuring that *wt*(wt1) ≥ *wt*(wt2) holds for every word (w, wt2) in Ŵ. Since Ŵ is a safety automaton, for a word (w, wt1) ∈ Σ̂<sup>ω</sup>, all words of the form (w, wt2) ∈ Ŵ can be traced by subset-construction. Similarly, since the comparator C for non-strict inequality (≥) is a safety automaton, all words of the form (wt1, wt2) ∈ C can be traced by subset-construction as well. The construction needs to carefully align the word (w, wt1) with all possible (w, wt2) ∈ Ŵ and (wt1, wt2) ∈ C.

*Construction of* Maximal(W). Let W be a weighted automaton with annotated automaton Ŵ, and let C denote its regular safety comparator for non-strict inequality. Let S<sub>W</sub> denote the set of states of W (and Ŵ) and S<sub>C</sub> the set of states of C. We define Maximal(W) = (S, s<sub>I</sub>, Σ̂, δ, F), where states are of the form (s, X) with s ∈ S<sub>W</sub> and X ⊆ S<sub>W</sub> × S<sub>C</sub>, the initial state pairs the initial states of Ŵ and C, and a state (s, X) is accepting iff s and every t and c appearing in X are accepting in their underlying automata (see Lemma 4). The transition relation contains ((s, X), (a, v), (s′, X′)) iff

1. s −(a, v)→ s′ is a transition in Ŵ, and
2. (t′, c′) ∈ X′ if there exist (t, c) ∈ X and a weight v′ such that t −(a, v′)→ t′ and c −(v, v′)→ c′ are transitions in Ŵ and C, respectively.
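The subset component of this construction can be sketched in executable form. The sketch below is illustrative: the function and dictionary names are ours, and the comparator is a toy *pointwise* ≥ safety automaton (with a rejecting trap state `'bad'`), not the paper's DS-comparator. It only shows how condition (b) is traced by pairs (t, c) under subset-construction.

```python
def maximal_step(X, a, v, trans_W, trans_C):
    """One step of the condition-(b) component of Maximal(W).
    X: set of pairs (t, c): t a state of the annotated automaton W-hat,
       c a state of the comparator C for non-strict inequality.
    (a, v): letter and weight read on the candidate run (w, wt1).
    trans_W: dict t -> list of (letter, weight, succ) transitions of W-hat.
    trans_C: dict c -> list of (w1, w2, succ) transitions of C.
    Spawns every alternative run (w, wt2) of W-hat and, in lockstep,
    feeds the weight pair (v, alternative's weight) to the comparator."""
    X2 = set()
    for (t, c) in X:
        for (b, v2, t2) in trans_W[t]:
            if b != a:
                continue
            for (w1, w2, c2) in trans_C[c]:
                if (w1, w2) == (v, v2):
                    X2.add((t2, c2))
    return X2

# Toy instance: a one-state weighted automaton over letter 'a', weights {1, 2};
# the comparator accepts pointwise >= ('bad' is a rejecting trap, i.e. safety).
trans_W = {'q': [('a', 1, 'q'), ('a', 2, 'q')]}
trans_C = {'ok':  [(w1, w2, 'ok' if w1 >= w2 else 'bad')
                   for w1 in (1, 2) for w2 in (1, 2)],
           'bad': [(w1, w2, 'bad') for w1 in (1, 2) for w2 in (1, 2)]}

X = {('q', 'ok')}
X = maximal_step(X, 'a', 2, trans_W, trans_C)   # weight 2 dominates both alternatives
safe_after_2 = all(c == 'ok' for (_, c) in X)
X = maximal_step(X, 'a', 1, trans_W, trans_C)   # weight 1 is beaten by alternative 2
safe_after_1 = all(c == 'ok' for (_, c) in X)
```

Once a pair reaches a non-accepting comparator state, it can never recover, which is exactly why the resulting automaton is safety.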

**Lemma 4.** *Let* W *be a weighted automaton with regular safety comparator* C *for non-strict inequality. Then the size of* Maximal(W) *is* |W| · 2<sup>O(|W|·|C|)</sup>*.*

*Proof (Proof sketch).* A state (s, {(t1, c1), ..., (tn, cn)}) is non-accepting in the automaton if one of s, t<sub>i</sub>, or c<sub>j</sub> is non-accepting in the underlying automata Ŵ and the comparator. Since Ŵ and the comparator automaton are safety, all outgoing transitions from a non-accepting state go to non-accepting states in the underlying automata. Therefore, all outgoing transitions from a non-accepting state in Maximal(W) go to non-accepting states in Maximal(W), so Maximal(W) is a safety automaton. For correctness of the transition relation, one must prove that transitions of type (1) satisfy condition (a), while transitions of type (2) satisfy condition (b). Maximal(W) forms the conjunction of (a) and (b), and hence accepts the language of maximal words of W.

A similar construction proves that the maximal automaton of a weighted automaton W with regular co-safety comparator C for strict inequality contains |W| · 2<sup>O(|W|·|C|)</sup> states. In this case, however, the maximal automaton may not be a safety automaton. Therefore, Lemma 4 generalizes to:

**Corollary 3.** *Let* W *be a weighted automaton with regular safety/co-safety comparator* C*. Then* Maximal(W) *is a Büchi automaton of size* |W| · 2<sup>O(|W|·|C|)</sup>*.*

**Counterexample Automaton.** This section covers the construction of the counterexample automaton. Given weighted automata P and Q, an annotated word (w, wt<sub>P</sub>) in annotated automaton P̂ is a *counterexample word* of P ⊆ Q (or P ⊂ Q) if there exists (w, wt<sub>Q</sub>) in Maximal(Q) s.t. *wt*(wt<sub>P</sub>) > *wt*(wt<sub>Q</sub>) (or *wt*(wt<sub>P</sub>) ≥ *wt*(wt<sub>Q</sub>)). Clearly, an annotated word (w, wt<sub>P</sub>) is a counterexample word iff there exists a counterexample run of w with weight sequence wt<sub>P</sub> in P.

For this section, we abbreviate strict and non-strict to strct and nstrct, respectively. For inc ∈ {strct, nstrct}, the *counterexample automaton* for inc-quantitative inclusion, denoted Counterexample(inc), is the automaton that contains all counterexample words of the problem instance. We construct the counterexample automaton as follows:

**Lemma 5.** *Let* P*,* Q *be weighted automata with regular safety/co-safety comparators. For* inc ∈ {strct, nstrct}*,* Counterexample(inc) *is a Büchi automaton.*

*Proof.* We construct a Büchi automaton Counterexample(inc) for inc ∈ {strct, nstrct} that contains the counterexample words of inc-quantitative inclusion. Since the comparators are regular safety/co-safety, Maximal(Q) is a Büchi automaton (Corollary 3). Construct the product P̂ × Maximal(Q) such that transition (p1, q1) −(a, v1, v2)→ (p2, q2) is in the product iff p1 −(a, v1)→ p2 and q1 −(a, v2)→ q2 are transitions in P̂ and Maximal(Q), respectively. A state (p, q) is accepting if both p and q are accepting in P̂ and Maximal(Q). One can show that the product accepts (w, wt<sub>P</sub>, wt<sub>Q</sub>) iff (w, wt<sub>P</sub>) and (w, wt<sub>Q</sub>) are words in P̂ and Maximal(Q), respectively.

If inc = strct, intersect P̂ × Maximal(Q) with the comparator for ≥. If inc = nstrct, intersect P̂ × Maximal(Q) with the comparator for >. Since the comparator is a safety or co-safety automaton, the intersection is taken without the cyclic counter. Therefore, (s1, t1) −(a, v1, v2)→ (s2, t2) is a transition in the intersection iff s1 −(a, v1, v2)→ s2 and t1 −(v1, v2)→ t2 are transitions in the product and the appropriate comparator, respectively. State (s, t) is accepting if both s and t are accepting. The intersection accepts (w, wt<sub>P</sub>, wt<sub>Q</sub>) iff (w, wt<sub>P</sub>) is a counterexample of inc-quantitative inclusion. Counterexample(inc) is obtained from the intersection by projecting out the second weight component: transition m −(a, v1, v2)→ n is transformed to m −(a, v1)→ n.
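The last two steps — synchronizing on the weight pair and projecting out the second weight — can be sketched concretely, assuming transitions are given as explicit sets of labeled triples (an illustrative encoding of ours, not code from the paper's tools):

```python
def intersect_and_project(prod_trans, comp_trans):
    """Synchronize the product P-hat x Maximal(Q) with the comparator on the
    weight pair (v1, v2) -- no cyclic counter is needed because the comparator
    is safety/co-safety -- and then project out v2.
    prod_trans: set of (s1, (a, v1, v2), s2)
    comp_trans: set of (t1, (v1, v2), t2)
    Returns transitions ((s, t), (a, v1), (s', t')) of Counterexample(inc)."""
    out = set()
    for (s1, (a, v1, v2), s2) in prod_trans:
        for (t1, (w1, w2), t2) in comp_trans:
            if (w1, w2) == (v1, v2):
                out.add(((s1, t1), (a, v1), (s2, t2)))
    return out

# tiny example: one product transition matched by one comparator transition
prod = {('p', ('a', 1, 0), 'p')}
comp = {('c', (1, 0), 'c')}   # the comparator allows the weight pair (1, 0)
result = intersect_and_project(prod, comp)
```

Note that the projected automaton may be nondeterministic even when its inputs are deterministic, which is harmless for the subsequent emptiness check.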

**Quantitative Inclusion and DS-inclusion.** In this section, we give the final algorithm for quantitative inclusion with regular safety/co-safety comparators. Since DS-comparators are regular safety/co-safety comparators, this yields an algorithm for DS-inclusion with better complexity than previous results.

**Theorem 4.** *Let* P*,* Q *be weighted automata with regular safety/co-safety comparators. Let* C<sub>≤</sub> *and* C<sub><</sub> *be the comparators for* ≤ *and* <*, respectively. Then* inc*-quantitative inclusion, for* inc ∈ {strct, nstrct}*, reduces to emptiness checking of a Büchi automaton of size* |P| · |C| · |Q| · 2<sup>O(|Q|·|C|)</sup>*, where* C = C<sub>≤</sub> *if* inc = strct *and* C = C<sub><</sub> *otherwise.*
*Proof.* Strict and non-strict are abbreviated to strct and nstrct, respectively. For inc ∈ {strct, nstrct}, inc-quantitative inclusion holds iff Counterexample(inc) is empty. The size of Counterexample(inc) is the product of the sizes of P, Maximal(Q) (Corollary 3), and the appropriate comparator, as described in Lemma 5.

In contrast, quantitative inclusion with regular comparators reduces to emptiness of a Büchi automaton with |P| · 2<sup>O(|P||Q||C|·log(|P||Q||C|))</sup> states [12]. The 2<sup>O(n log n)</sup> blow-up is unavoidable due to Büchi complementation. Hence, quantitative inclusion with regular safety/co-safety comparators has lower worst-case complexity.

Lastly, we use the results developed in the previous sections to solve DS-inclusion. Since DS-comparators are regular safety/co-safety (Corollary 1), an immediate consequence of Theorem 4 is an improvement in the worst-case complexity of DS-inclusion over prior results with regular DS-comparators. Furthermore, since the regular safety/co-safety DS-comparators are of the same size for all inequalities (Theorem 3), we get:

**Corollary 4.** *Let* P*,* Q *be weighted automata, and* C *be a regular safety/co-safety DS-comparator with integer discount-factor* d > 1*.* Strict DS-inclusion *reduces to emptiness checking of a* safety automaton *of size* |P| · |C| · |Q| · 2<sup>O(|Q|·|C|)</sup>*.*

*Proof (Proof sketch).* When the comparator for non-strict inequality is a safety automaton, as it is for DS-comparators, the maximal automaton is a safety automaton (Lemma 3). One can then show that the counterexample automaton is also a safety automaton.

A similar argument proves that *non-strict DS-inclusion* reduces to emptiness of a *weak Büchi automaton* [27] of size |P| · |C| · |Q| · 2<sup>O(|Q|·|C|)</sup> (see Appendix).

**Corollary 5 (DS-inclusion with safety/co-safety comparator).** *Let* P*,* Q *be weighted automata, and* C *be a regular (co-)safety DS-comparator with integer discount-factor* d > 1*. The complexity of DS-inclusion is* |P| · |C| · |Q| · 2<sup>O(|Q|·|C|)</sup>*.*

#### **4 Implementation and Experimental Evaluation**

The goal of the empirical analysis is to examine the performance of DS-inclusion with integer discount-factors and safety/co-safety comparators against existing tools, in order to investigate the practical merit of our algorithm. We compare against (a) the regular-comparator based tool QuIP, and (b) the DS-determinization and linear-programming tool DetLP.

QuIP is written in C++, and invokes the state-of-the-art Büchi language-inclusion solver RABIT [2]. We enable the -fast flag in RABIT, and tune its Java threads with Xss, Xms, and Xmx set to 1 GB, 1 GB, and 8 GB, respectively. DetLP is also written in C++, and uses the linear-programming solver GLPSOL provided by GLPK (GNU Linear Programming Kit) [1]. We compare these tools along two axes: runtime and number of benchmarks solved.

**Fig. 1.** s<sub>P</sub> = s<sub>Q</sub> on x-axis, wt = 4, δ = 3, d = 3, P ⊂ Q

**Implementation Details.** The algorithms for strict DS-inclusion with integer discount-factor d > 1 (Corollary 4) and for non-strict DS-inclusion check for emptiness of the counterexample automata. A naive algorithm would construct the counterexample automata in full, and then check that they are empty by ensuring the absence of an *accepting lasso*.

We implement a more efficient algorithm. Our implementation exploits the fact that the constructions for DS-inclusion rely on subset-construction at their intermediate steps. This facilitates an *on-the-fly procedure*, since the successor states of a state in the counterexample automaton can be determined directly from the input weighted automata and the comparator automata. The algorithm terminates as soon as an accepting lasso is detected. When an accepting lasso is absent, the algorithm traverses all states and edges of the counterexample automaton.
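For the safety case (Corollary 4), nonemptiness amounts to finding a cycle of accepting states reachable from the initial state. A minimal on-the-fly sketch in Python follows; the function names are ours, and the actual QuIPFly implementation is not shown in the paper:

```python
def nonempty(init, successors, accepting):
    """On-the-fly nonemptiness check for a safety automaton: search for an
    accepting lasso, i.e. a cycle of accepting states reachable from init.
    `successors` is a function, so states are expanded lazily and the
    counterexample automaton is never constructed in full."""
    color = {}                     # state -> 'active' (on DFS stack) | 'done'
    def dfs(s):
        color[s] = 'active'
        for t in successors(s):
            if not accepting(t):   # safety: non-accepting states are traps
                continue
            if color.get(t) == 'active':
                return True        # back edge -> accepting lasso found
            if t not in color and dfs(t):
                return True
        color[s] = 'done'
        return False
    return accepting(init) and dfs(init)

# toy state graphs given as dicts
lasso = {0: [1], 1: [2], 2: [1]}   # cycle 1 -> 2 -> 1
dag   = {0: [1], 1: [2], 2: []}    # no cycle at all
has_lasso = nonempty(0, lambda s: lasso[s], lambda s: True)
no_lasso  = nonempty(0, lambda s: dag[s],   lambda s: True)
```

The search stops at the first back edge, matching the early-termination behavior described above.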

We implement the optimized on-the-fly algorithm in a prototype QuIPFly. QuIPFly is written in Python 2.7.12. QuIPFly employs basic implementation-level optimizations to avoid excessive re-computation.

**Design and Setup for Experiments.** Due to the lack of standardized benchmarks for weighted automata, we follow a standard approach to the performance evaluation of automata-theoretic tools [3,30,38] by experimenting with *randomly generated* benchmarks, using the random benchmark generation procedure described in [11].

The parameters for each experiment are the number of states s<sub>P</sub> and s<sub>Q</sub> of the weighted automata, the transition density δ, the maximum weight wt, the integer discount-factor d, and inc ∈ {strct, nstrct}. In each experiment, weighted automata P and Q are randomly generated, and the runtime of inc-DS-inclusion for all three tools is reported with a timeout of 900 s. We run the experiment for each parameter tuple 50 times. All experiments are run on a single node of a high-performance cluster consisting of two quad-core Intel Xeon processors running at 2.83 GHz, with 8 GB of memory per node. We experiment with s<sub>P</sub> = s<sub>Q</sub> ranging from 0 to 1500 in increments of 25, δ ∈ {3, 3.5, 4}, d = 3, and wt ∈ {d + 1, d<sup>3</sup> − 1, d<sup>4</sup> − 1}.

**Fig. 2.** s<sub>P</sub> = s<sub>Q</sub> = 75, wt = 4, δ = 3, d = 3, P ⊂ Q

**Observations and Inferences.**<sup>1</sup> For clarity of exposition, we present the observations for only one parameter-tuple. Trends and observations for other parameters were similar.

*QuIPFly Outperforms QuIP* by at least an order of magnitude in runtime. Figure 1 plots the median runtime of all 50 experiments for the given parameter values for QuIP and QuIPFly. More importantly, QuIPFly solves all of our benchmarks within a fraction of the timeout, whereas QuIP struggled to solve at least 50% of the benchmarks with larger inputs (beyond s<sub>P</sub> = s<sub>Q</sub> = 1000). The primary cause of failure is memory overflow inside RABIT. We conclude that regular safety/co-safety comparators outperform their regular counterparts, giving credit to the simpler subset constructions as opposed to Büchi complementation.

*QuIPFly Outperforms DetLP* comprehensively in runtime and in number of benchmarks solved. We were unable to plot DetLP in Fig. 1 since it solved fewer than 50% of the benchmarks even on small input instances. Figure 2 compares the runtime of both tools on the same set of 50 benchmarks for a representative parameter tuple on which all 50 benchmarks were solved. The plot shows that QuIPFly beats DetLP by 2–4 orders of magnitude on all benchmarks.

*Overall Verdict.* Overall, QuIPFly outperforms QuIP and DetLP by a significant margin along both axes, runtime and number of benchmarks solved. This analysis gives unanimous evidence in favor of our safety/co-safety approach to solving DS-inclusion.

<sup>1</sup> Figures are best viewed online and in color.

#### **5 Concluding Remarks**

The goal of this paper was to build scalable algorithms for DS-inclusion. To this end, the paper furthers the understanding of language-theoretic properties of the discounted-sum aggregate function by demonstrating that DS-comparison languages form safety and co-safety languages, and it utilizes these properties to obtain a decision procedure for DS-inclusion that offers both tighter theoretical complexity and improved scalability. All in all, the key insights of this work are:

1. DS-comparison languages form safety and co-safety languages;
2. quantitative inclusion with regular safety/co-safety comparators avoids Büchi complementation, and hence has lower worst-case complexity than with regular comparators; and
3. the resulting subset-construction based algorithm admits an efficient on-the-fly implementation, realized in our prototype QuIPFly.

To the best of our knowledge, this is the first work that applies language-theoretic properties such as safety/co-safety in the context of quantitative reasoning.

More broadly, this paper demonstrates that the close integration of language-theoretic and quantitative properties can yield novel algorithms for quantitative reasoning that benefit from advances in qualitative reasoning.

**Acknowledgements.** We thank anonymous reviewers for their comments. We thank D. Fried, L. M. Tabajara, and A. Verma for their valuable inputs on initial drafts of the paper. This work was partially supported by NSF Grant No. CCF-1704883.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Clock Bound Repair for Timed Systems**

Martin Kölbl<sup>1(B)</sup>, Stefan Leue<sup>1(B)</sup>, and Thomas Wies<sup>2(B)</sup>

<sup>1</sup> University of Konstanz, Konstanz, Germany
{Martin.Koelbl,Stefan.Leue}@uni-konstanz.de
<sup>2</sup> New York University, New York, NY, USA
wies@cs.nyu.edu

**Abstract.** We present algorithms and techniques for the repair of timed system models, given as networks of timed automata (NTA). The repair is based on an analysis of timed diagnostic traces (TDTs) that are computed by real-time model checking tools, such as UPPAAL, when they detect the violation of a timed safety property. We present an encoding of TDTs in linear real arithmetic and use the MaxSMT capabilities of the SMT solver Z3 to compute possible repairs to clock bound values that minimize the necessary changes to the automaton. We then present an admissibility criterion, called functional equivalence, that assesses whether a proposed repair is admissible in the overall context of the NTA. We have implemented a proof-of-concept tool called TARTAR for the repair and admissibility analysis. To illustrate the method, we have considered a number of case studies taken from the literature and automatically injected changes to clock bounds to generate faulty mutations. Our technique is able to compute a feasible repair for 91% of the faults detected by UPPAAL in the generated mutants.

**Keywords:** Timed automata · Automated repair · Admissibility of repair · TARTAR tool

#### **1 Introduction**

The analysis of system design models using model checking technology is an important step in the system design process. It enables the automated verification of system properties against given design models. The automated nature of model checking facilitates the integration of the verification step into the design process since it requires no further intervention of the designer once the model has been formulated and the property has been specified.

Often it is sufficient to abstract from real time aspects when checking system properties, in particular when the focus is on functional aspects of the system. However, when non-functional properties, such as response times or the timing of periodic behavior, play an important role, it is necessary to incorporate real time aspects into the models and the specification, as well as to use specialized real-time model checking tools, such as UPPAAL [6], Kronos [31] or opaal [11] during the verification step.

Next to the automatic nature of model checking, the ability to return counterexamples, in real-time model checking often referred to as timed diagnostic traces (TDT), is a further practical benefit of the use of model checking technology. A TDT describes a timed sequence of steps that lead the design model from the initial state of the system into a state violating a real-time property. A TDT neither constitutes a causal explanation of the property violation, nor does it provide hints as to how to correct the model.

In this paper we describe an automated method that computes proposals for possible repairs of a network of timed automata (NTA) that avoid the violation of a timed safety property. Consider the TDT depicted as a time annotated sequence diagram [5] in Fig. 1. This scenario describes a simple message exchange where the process dbServer sends a message req to process db which, after some processing steps, returns a message ser to dbServer. Assume a requirement on the system to be that the time from sending req to receiving ser is not to be more than 4 time units. Assume that the timing interval annotations on the sequence diagram represent the minimum and maximum time for the message transmission and processing steps that the NTA, from which the diagram has been derived, permits. It is then easy to see that it is possible to execute the system in such a way that this property is violated.

Various changes to the underlying NTA model, depicted in Fig. 2, may avoid this property violation. For instance, the maximum time it takes to transmit the req and ser messages can be constrained to be at most 1 time unit, respectively. Alternatively, it may be possible to avoid the property violation by reducing two of the three timings by 0.5 time units. In any case, proposing such changes to the model may either serve to correct clerical mistakes made during the editing of the model, or point to necessary changes in the dimensioning of its time resources, thus contributing to improved design space exploration.

The repair method described in this paper relies on an encoding of a TDT as a constraint system in linear real arithmetic. This encoding provides a symbolic abstract semantics for the TDT by constraining the sojourn time of the NTA in the locations visited along the trace. The constraint system is then augmented by auxiliary model variation variables which represent syntactic changes to the NTA model, for instance the variation of a location invariant condition or a transition guard. We assert that the thus modified constraint system implies the non-reachability of a violation. At the same time, we assert that the model variation variables have a value that implies that no change of the NTA model will occur, for instance by setting a clock bound variation variable to 0. This renders the resulting constraint system unsatisfiable.

In order to compute a repair, we derive a partial MaxSMT instance by turning the constraints that disable any repair into soft constraints. We solve this MaxSMT instance using the SMT solver Z3 [25]. If the MaxSMT instance admits a solution, the resulting model provides values of the model variation variables. These values indicate a repair of the NTA model which entails that, along the sequence of locations represented by the TDT, the property violation will no longer be reachable.

**Fig. 1.** TDT represented as a sequence diagram with timing annotations
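The objective of this MaxSMT query (minimize the number of changed clock bounds while making the violation unreachable) can be illustrated with a brute-force toy analogue for the running example. Everything below is a simplifying assumption of ours — the constants are transcribed from the narrative around Figs. 1 and 2, corrections are restricted to integers, and the paper uses Z3's partial MaxSMT rather than enumeration:

```python
from itertools import product

# Running example, worst case: req transmission takes at most 2 time units,
# processing exactly 1, ser transmission at most 2; the property requires the
# end-to-end time to stay below 4.
REQ_MAX, PROC, SER_MAX, DEADLINE = 2, 1, 2, 4
LOWER = 1   # both transmissions need at least 1 time unit to stay feasible

def best_repair():
    """Minimize the number of changed clock bounds (the soft constraints)
    subject to the hard constraints: the repaired bounds keep the trace
    feasible, and the worst-case end-to-end time no longer violates the
    property."""
    best = None
    for v_req, v_ser in product(range(-2, 1), repeat=2):
        if REQ_MAX + v_req < LOWER or SER_MAX + v_ser < LOWER:
            continue   # repair would make the trace infeasible
        if (REQ_MAX + v_req) + PROC + (SER_MAX + v_ser) >= DEADLINE:
            continue   # the violation would still be reachable
        cost = (v_req != 0) + (v_ser != 0)
        if best is None or cost < best[0]:
            best = (cost, v_req, v_ser)
    return best

cost, v_req, v_ser = best_repair()
```

For these constants the minimum-cost repair tightens both transmission bounds by one time unit, matching the fix sketched in the introduction (constraining each transmission to at most 1 time unit).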

In the next step, it is necessary to check whether the computed repair is admissible in the context of the full NTA. This is important since the repair was computed locally, with respect to only a single given TDT. Thus, it is necessary to define a notion of admissibility that is reasonable and helpful in this setting. To this end, we propose the notion of *functional equivalence*, which states that, as a result of the computed repair, neither will erstwhile existing functional behavior be purged, nor will new functional behavior be added. Functional behavior in this sense is represented by the languages accepted by the untimed automata of the unrepaired and the repaired NTAs. Functional equivalence is then defined as equivalence of the languages accepted by these automata. We propose a zone-based automaton construction for implementing the functional equivalence test that is efficient in practice.

We have implemented our proposed method in a proof-of-concept tool called TARTAR<sup>1</sup>. Our evaluation of TARTAR is based on several non-trivial NTA models taken from the literature, including the frequently considered Pacemaker model [19]. For each model, we automatically generate mutants by injecting clock bound variations, which we then model check using UPPAAL and repair using TARTAR. The evaluation shows that our technique is able to compute an admissible repair for 91% of the detected faults.

*Related Work.* There are relatively few results available on a formal treatment of TDTs. The zone-based approach to real-time model checking, which relies on a constraint-based abstraction of the state space, is proposed in [14]. The use of constraint solving to perform reachability analysis for NTAs is described in [30]. This approach ultimately leads to the on-the-fly reachability analysis algorithm used in UPPAAL [7]. [12] defines the notion of a time-concrete UPPAAL counterexample. Work documented in [27] describes the computation of concrete delays for symbolic TDTs. The above cited approaches address neither fault analysis nor repair for TDTs. Our use of MaxSMT solvers for computing minimal repairs is inspired by the use of MaxSAT solvers for fault localization in C programs, which was first explored in the BugAssist tool [20,21]. Our approach also shares some similarities with syntax-guided synthesis [2,28], which has also been deployed in the context of program repair [22]. One key difference is how we determine the admissibility of a repair in the overall system, which takes advantage of the semantic restrictions imposed by timed automata.

*Structure of the Paper.* We introduce the automata and real-time concepts needed in our analysis in Sect. 2. In Sect. 3 we present the logical formalization of TDTs. The repair and admissibility analyses are presented in Sects. 4 and 5, respectively. We report on tool development, experimental evaluation, and case studies in Sect. 6, and Sect. 7 concludes.

<sup>1</sup> TARTAR and links to all models used in this paper can be found at https://github.com/sen-uni-kn/tartar.

#### **2 Preliminaries**

The timed automaton model that we use in this paper is adapted from [7]. Given a set of *clocks* C, we denote by B(C) the set of all *clock constraints* over C, which are conjunctions of *atomic clock constraints* of the form c ∼ n, where c ∈ C, ∼ ∈ {<, ≤, =, ≥, >}, and n ∈ N. A *timed automaton (TA)* T is a tuple T = (L, l<sub>0</sub>, C, Σ, Θ, I) where L is a finite set of locations, l<sub>0</sub> ∈ L is an initial location, C is a finite set of clocks, Σ is a set of action labels, Θ ⊆<sub>fin</sub> L × B(C) × Σ × 2<sup>C</sup> × L is a set of *actions*, and I : L → B(C) denotes a labeling of locations with clock constraints, referred to as location invariants. For θ ∈ Θ with θ = (l, g, a, r, l′), we refer to g as the *guard* of θ and to r as its *clock resets*.

The operational semantics of T is given by a timed transition system consisting of states s = (l, u), where l is a location and u : C → R<sup>+</sup> is a *clock valuation*. The initial state s<sub>0</sub> is (l<sub>0</sub>, u<sub>0</sub>), where u<sub>0</sub> maps all clocks to 0. For a clock constraint B we write u |= B iff B evaluates to true in u. There are two types of transitions. An *action transition* models the execution of an action whose guard is satisfied. These transitions are instantaneous and reset the specified clocks. The passing of time in a location is modeled by *delay transitions*. Both types of transitions guarantee that location invariants are satisfied in the pre and post state. Formally, we have (l, u) −t→ (l′, u′) iff

– t = δ ∈ R<sup>+</sup>, l′ = l, u′ = u + δ, and both u |= I(l) and u′ |= I(l) (delay transition), or
– t = θ = (l, g, a, r, l′) ∈ Θ, u |= g, u′ = u[r := 0], and both u |= I(l) and u′ |= I(l′) (action transition).

**Definition 1.** *A* symbolic timed trace *(STT) of* T *is a sequence of actions* S = θ<sub>0</sub>, ..., θ<sub>n−1</sub>*. A* realization *of* S *is a sequence of delay values* δ<sub>0</sub>, ..., δ<sub>n</sub> *such that there exist states* s<sub>0</sub>, ..., s<sub>n</sub>, s<sub>n+1</sub> *with* s<sub>i</sub> −δ<sub>i</sub>→ −θ<sub>i</sub>→ s<sub>i+1</sub> *for all* i ∈ [0, n) *and* s<sub>n</sub> −δ<sub>n</sub>→ s<sub>n+1</sub>*. We say that an STT is* feasible *if it has at least one realization.*

*Property Specification.* We focus on the analysis of timed safety properties, which we characterize by an invariant formula that has to hold for all reachable states of a TA. These properties state, for instance, that there are certain locations in which the value of a clock variable is not above, equal to, or below a certain (integer) bound. Formally, let T = (L, l<sub>0</sub>, C, Σ, Θ, I) be a TA. A *timed safety property* Π is a Boolean combination of atomic clock constraints and *location predicates* @l, where l ∈ L. A location predicate @l holds in a state (l′, u) of T iff l′ = l. We say that an STT S witnesses a violation of Π in T if there exists a realization of S whose induced final state does not satisfy Π. We refer to such an STT as a *timed diagnostic trace* of T for Π.

T satisfies Π iff all its reachable states satisfy Π. This problem can be decided using model checking tools such as Kronos [31] and UPPAAL [6]. UPPAAL in particular computes a finite abstraction of the state space of an NTA using a zone graph construction. Reachability analysis is then performed by an on-the-fly search of the zone graph. If the property is violated, the tool generates a feasible TDT that witnesses the violation. The objective of our work is to analyze TDTs and to propose repairs for the property violation that they represent. We use TDTs generated by the UPPAAL tool in our implementation, but we maintain that our results can be adapted to any other tool producing TDTs.

We further note that UPPAAL takes a *network of timed automata* (NTA) as input, which is a CCS [24] style parallel composition of timed automata T<sub>1</sub> | ... | T<sub>n</sub>. Since our analysis and repair techniques focus on timing-related errors rather than synchronization errors, we use TAs rather than NTAs in our formalization. However, our implementation works on NTAs.

*Example 1.* The running example that we use throughout the paper consists of an NTA of two timed automata, depicted in Fig. 2. As alluded to in the introduction, the TAs dbServer and db synchronize via the exchange of messages modeled by the pairs of send and receive actions req! and req?, and ser! and ser?, respectively. The transmission time of the req message is controlled by the clock variable x and can range between 1 and 2 time units. This is achieved by the location invariant x<=2 on the reqReceived location in db together with the transition guard x>=1 on the transition from reqReceived to reqProcessing. A similar mechanism using clock variable z is used to constrain the timing of the transfer of message ser to be within 1 and 2 time units. The processing time in dbServer is constrained to exactly 1 time unit by the location invariant y<=1 and the transition guard y>=1. In dbServer, a transition to location timeout can be triggered when the guard z==2 is satisfied in location serReceiving. The clock variable x, which is not reset until the next req message is sent, records the time that has elapsed since sending req and is used in location serReceiving to verify whether more than 4 time units have passed since req was sent. The timed safety property that we consider for our example is Π = ¬@dbServer.serReceiving ∨ (x < 4). For the violation of this property, UPPAAL produces the TDT S = θ<sub>0</sub> ... θ<sub>3</sub> where

θ<sub>0</sub> = ((initial, reqAwaiting), ∅, τ, ∅, (reqCreate, reqAwaiting))
θ<sub>1</sub> = ((reqCreate, reqAwaiting), ∅, τ, {x}, (reqSent, reqReceived))
θ<sub>2</sub> = ((reqSent, reqReceived), {x ≥ 1}, τ, {y}, (reqSent, reqProc.))
θ<sub>3</sub> = ((reqSent, reqProc.), {y ≥ 1}, τ, {z}, (serReceiving, reqAwait.))

#### **3 Logical Encoding of Timed Diagnostic Traces**

Our analysis relies on a logical encoding of TDTs in the theory of quantifier-free linear real arithmetic. For the remainder of this paper, we fix a TA T = (L, l<sup>0</sup>, C, Σ, Θ, I) with a safety property Π and assume that S = θ0,...,θ<sup>n</sup>−<sup>1</sup> is an STT of T. We use the following notation for our logical encoding where j ∈ [0, n + 1] is a position in a realization of S and c ∈ C is a clock:


**Fig. 2.** Network of timed automata - running example

– c<sub>j</sub> and δ<sub>j</sub> denote the variables for the value of clock c at position j and for the delay spent at position j, respectively.

– *ibounds*(c, l) denotes the set of pairs (β, ∼) such that the atomic clock constraint c ∼ β appears in the invariant I(l).

– *gbounds*(c, θ) denotes the set of pairs (β, ∼) such that the atomic clock constraint c ∼ β appears in the guard of action θ.

To illustrate the use of *ibounds*, assume location l to be labeled with the invariant x > 2 ∧ x ≤ 4 ∧ y ≤ 1; then *ibounds*(x, l) = {(2, >), (4, ≤)}. The usage of *gbounds* is analogous.

**Definition 2.** *The* timed diagnostic trace constraint system *associated with STT* S *is the conjunction* T *of the following constraints:*

$$\begin{aligned}
&\mathcal{C}_0 \equiv \bigwedge_{c \in C} c_0 = 0 &&\text{(clock initialization)}\\
&\mathcal{A} \equiv \bigwedge_{j \in [0,n]} \delta_j \ge 0 &&\text{(time advancement)}\\
&\mathcal{R} \equiv \bigwedge_{j \in [0,n)} \bigwedge_{c \in r_j} c_{j+1} = 0 &&\text{(clock resets)}\\
&\mathcal{D} \equiv \bigwedge_{j \in [0,n)} \bigwedge_{c \notin r_j} c_{j+1} = c_j + \delta_j &&\text{(sojourn time)}\\
&\mathcal{I} \equiv \bigwedge_{j \in [0,n]} \bigwedge_{(\beta, \sim) \in \mathit{ibounds}(c, l_j)} c_j \sim \beta \wedge c_j + \delta_j \sim \beta &&\text{(location invariants)}\\
&\mathcal{G} \equiv \bigwedge_{j \in [0,n)} \bigwedge_{(\beta, \sim) \in \mathit{gbounds}(c, \theta_j)} c_j + \delta_j \sim \beta &&\text{(transition guards)}\\
&\mathcal{L} \equiv @l_n \wedge \bigwedge_{l \neq l_n} \neg @l &&\text{(location predicates)}
\end{aligned}$$

*where* r<sub>j</sub> *denotes the clock resets of* θ<sub>j</sub> *and* l<sub>j</sub> *the location at position* j*.*

*Let further* Φ ≡ Π[**c**_{n+1}/**c**], *where* Π[**c**_{n+1}/**c**] *is obtained from* Π *by substituting all occurrences of clocks* c ∈ C *with* c_{n+1}. *Then the* Π*-extended TDT constraint system associated with* S *is defined as* T^Π = T ∧ ¬Φ.

To illustrate the encoding, consider the transition θ_3 of the TDT in Example 1, corresponding to the transition from state (reqSent, reqProcessing) to state (serReceiving, reqAwaiting) while resetting clock z in the NTA of Fig. 2. The encoding of the constraints on the clocks x, y, and z is as follows: y_3 + δ_3 ≥ 1, z_4 = 0, x_4 = x_3 + δ_3, and y_4 = y_3 + δ_3.
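As an illustrative sketch, the constraints of this step can be evaluated directly in Python over exact rationals; the clock values and the helper name `step_theta3` are hypothetical and not part of the paper's toolchain:

```python
from fractions import Fraction as F  # exact arithmetic, matching linear real arithmetic

def step_theta3(x3, y3, z3, d3):
    """Check the encoded constraints of transition theta_3 on a candidate
    realization: time advancement d_3 >= 0, guard y_3 + d_3 >= 1, reset of z,
    and sojourn-time updates of x and y. Returns the clock valuation at
    position 4, or None if a constraint is violated."""
    if d3 < 0:                    # delays must be non-negative
        return None
    if not (y3 + d3 >= 1):        # transition guard of theta_3
        return None
    return {"x": x3 + d3,         # x not reset: x_4 = x_3 + d_3
            "y": y3 + d3,         # y not reset: y_4 = y_3 + d_3
            "z": F(0)}            # z reset:     z_4 = 0

# A delay of 1 satisfies the guard when y_3 = 1/2:
print(step_theta3(F(1), F(1, 2), F(1, 2), F(1)))
```

A dedicated SMT solver replaces this per-step check in the actual analysis, since it must search for delays rather than merely validate them.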

**Lemma 1.** δ̂_0, ..., δ̂_n *is a realization of an STT* S *iff there exists a satisfying variable assignment* ι *for* T *such that* ι(δ_j) = δ̂_j *for all* j ∈ [0, n].

**Theorem 1.** *An STT* S *witnesses a violation of* Π *in* T *iff* T^Π *is satisfiable.*

#### **4 Repair**

We propose a repair technique that analyzes to what extent the clock bound values occurring in a single TDT are responsible for the violation of a specification Π. The analysis suggests possible syntactic repairs. In a second step, we define an admissibility test that assesses the admissibility of a repair in the context of the complete TA model. Throughout this section, we assume that S is a TDT for T and Π.

*Clock Bound Variation.* We introduce *bound variation variables* *v* that stand for *correction values* that the repair adds to the clock bounds occurring in location invariants and transition guards. The values are chosen such that no realization of S in the modified automaton still witnesses a violation of Π. To this end, we define a new constraint system that captures the conditions on the variables *v* under which the violation of Π no longer occurs in the corresponding trace of the modified automaton. Using this constraint system, we then define a maximum satisfiability problem whose solutions minimize the number of changes to T needed to achieve the repair.

Recall that the clock bounds occurring in location invariants and in transition guards are represented by the *ibounds* and *gbounds* sets defined for the TDT S. Notice that each clock variable c may be associated with m_{c,l} different clock bounds in the location invariant of l, denoted by the set *ibounds*(c, l) = {(β_1^{c,l}, ∼_1^{c,l}), ..., (β_{m_{c,l}}^{c,l}, ∼_{m_{c,l}}^{c,l})}. Similarly, we enumerate the bounds in *gbounds*(c, θ) as (β_k^{c,θ}, ∼_k^{c,θ}). To reduce notational clutter, we let the meta variable r range over pairs of the form c,l or c,θ. We then introduce bound variation variables v_k^r describing the possible static variation in the TA code of the clock bound β_k^r and modify the TDT constraint system accordingly. A variation of the bounds only affects the location invariant constraints I and the transition guard constraints G. We thus define an invariant variation constraint I^{bv} and a guard variation constraint G^{bv} that capture the clock bound modifications:

$$\begin{aligned} \mathcal{I}^{bv} & \equiv \bigwedge\_{(\beta\_k^r, \sim\_k^r) \in \mathit{ibounds}(c, l\_j)} c\_j \sim\_k^r (\beta\_k^r + v\_k^r) \wedge c\_j + \delta\_j \sim\_k^r (\beta\_k^r + v\_k^r) \\ \mathcal{G}^{bv} & \equiv \bigwedge\_{(\beta\_k^r, \sim\_k^r) \in \mathit{gbounds}(c, \theta\_j)} c\_j + \delta\_j \sim\_k^r (\beta\_k^r + v\_k^r) \end{aligned}$$

We also need constraints ensuring that the modified clock bounds remain positive:

$$\mathcal{Z}^{bv} \equiv \bigwedge\_{\beta\_k^r} \beta\_k^r + v\_k^r \ge 0$$

Putting all of this together we obtain the *bound variation TDT constraint system*

$$\mathcal{T}^{bv} \equiv \mathcal{C}\_0 \wedge \mathcal{A} \wedge \mathcal{R} \wedge \mathcal{D} \wedge \mathcal{I}^{bv} \wedge \mathcal{G}^{bv} \wedge \mathcal{Z}^{bv} \wedge \mathcal{L}$$

which captures all realizations of S in TAs T^{bv} that are obtained from T by modifying the clock bounds β_k^r by some semantically consistent variations v_k^r.

Consider the bound variation for the guard y ≥ 1 of transition θ_3 in Example 1. The modified guard constraint, a conjunct in G^{bv}, is y_3 + δ_3 ≥ 1 + v_1^{y,θ_3}. The corresponding non-negativity constraint from Z^{bv} is 1 + v_1^{y,θ_3} ≥ 0.

*Repair by Bound Variation Analysis.* The objective of the bound variation analysis is to provide hints to the system designer regarding which minimal syntactic changes to the considered model might prevent the violation of property Π. Minimality here is considered with respect to the number of clock bound values in invariants and guards that need to be changed.

We implement this analysis by using the bound variation TDT constraint system T^{bv} to derive an instance of the partial MaxSMT problem whose solutions yield candidate repairs for the timed automaton T. The partial MaxSMT problem takes as input a finite set of assertion formulas belonging to a fixed first-order theory. These assertions are partitioned into *hard* and *soft* assertions. The hard assertions F_H are assumed to hold, and the goal is to find a maximal subset F ⊆ F_S of the soft assertions such that F ∪ F_H is satisfiable in the given theory.

For our analysis, the hard assertions consist of the conjunction

$$\mathcal{F}\_H^{bv} \equiv (\exists \delta\_j, c\_j. \, \mathcal{T}^{bv}) \wedge (\forall \delta\_j, c\_j. \, \mathcal{T}^{bv} \Rightarrow \Phi).$$

Note that the free variables of F_H^{bv} are exactly the bound variation variables v_k^r. Given a satisfying assignment ι for F_H^{bv}, let T_ι be the timed automaton obtained from T by adding to each clock bound β_k^r the corresponding variation value ι(v_k^r), and let S_ι be the TDT corresponding to S in T_ι. Then F_H^{bv} guarantees that:

1. S_ι has at least one realization in T_ι (the first conjunct), and
2. no realization of S_ι witnesses a violation of Π (the second conjunct).
We refer to such an assignment ι as a *local clock bound repair* for T and S. To obtain a minimal local clock bound repair, we use the soft assertions given by the conjunction

$$\mathcal{F}\_S^{bv} \equiv \bigwedge\_{v\_k^r} v\_k^r = 0.$$

Clearly F_H^{bv} ∧ F_S^{bv} is unsatisfiable because T^{bv} ∧ F_S^{bv} is equisatisfiable with T, and T ∧ ¬Φ is satisfiable by assumption. However, if there exists at least one local clock bound repair for T and S, then F_H^{bv} alone is satisfiable. In this case, the MaxSMT instance F_H^{bv} ∪ F_S^{bv} has at least one solution. Every satisfying assignment of such a solution corresponds to a local repair that minimizes the number of clock bounds that need to be changed in T.

Note that hard and soft assertions remain within a decidable logic. Using an SMT solver such as Z3, we can enumerate all the optimal solutions for the partial MaxSMT instance and obtain a minimal local clock bound repair from each of them.
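The enumeration underlying the partial MaxSMT step can be sketched in plain Python by brute force; the helper `max_soft_zero`, the finite candidate value grid, and the toy hard assertion are illustrative assumptions standing in for Z3's MaxSMT engine and its search over the reals:

```python
from itertools import combinations, product

def max_soft_zero(hard_sat, n_vars, candidates):
    """Brute-force sketch of partial MaxSMT: find a maximum set of bound
    variation variables that can stay 0 (the soft assertions) while the hard
    assertion remains satisfiable. hard_sat decides the hard assertion for a
    complete assignment over the finite candidate grid."""
    for keep in range(n_vars, -1, -1):          # maximise the number of v_k = 0
        for zeros in combinations(range(n_vars), keep):
            free = [i for i in range(n_vars) if i not in zeros]
            for vals in product(candidates, repeat=len(free)):
                v = [0] * n_vars
                for i, x in zip(free, vals):
                    v[i] = x
                if hard_sat(v):
                    return zeros, v             # minimal number of changed bounds
    return None

# Toy hard assertion: the violation disappears iff v0 + v1 <= -1,
# i.e., at least one bound must shrink (values purely illustrative):
print(max_soft_zero(lambda v: v[0] + v[1] <= -1, 2, [-2, -1, 1]))
```

The outer loop realizes the MaxSMT objective directly: the first satisfiable assignment found leaves the largest possible number of bounds unchanged.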

*Example 2.* We have applied the bound variation repair analysis to the TDT from Example 1, using TARTAR, which calls Z3. The following repairs were computed:


#### **5 Admissibility of Repair**

The synthesized repairs that lead to a TA T<sup>ι</sup> change the original TA T in fundamental ways, both syntactically and semantically. This brings up the question whether the synthesized repairs are admissible. In fact, one of the key questions is what notion of admissibility is meaningful in this context.

A *timed trace* [7] is a sequence of timed actions ξ = (t_1, a_1), (t_2, a_2), ... generated by a run of a TA, where t_i ≤ t_{i+1} for all i ≥ 1. The timed language of a TA T is the set of all its timed traces, which we denote by L_T(T). The untimed language of T consists of words over T's alphabet Σ for which there exists at least one timed trace of T forming the word. Formally, for a timed trace ξ = (t_1, a_1), (t_2, a_2), ..., the untime operator μ(ξ) returns the untimed trace ξ_μ = a_1 a_2 .... We define the untimed language L_μ(T) of the TA T as L_μ(T) = {μ(ξ) | ξ ∈ L_T(T)}.

Let B be a Büchi automaton (BA) [10] over some alphabet Σ. We write L(B) ⊆ Σ^ω for the language accepted by B. Similarly, we denote by L_f(B) ⊆ Σ* the language accepted by B when it is interpreted as a nondeterministic finite automaton (NFA). Further, we write *pref*(L(B)) for the set of all finite prefixes of words in L(B).

For a given NFA or BA M, the *closure* cl(M) denotes the automaton obtained from M by turning all of its states into accepting states. We call M closed iff M = cl(M). Notice that a Büchi automaton accepts a safety language if and only if it is closed [1].

*Admissibility Criteria.* From a *syntactic* point of view, the repair obtained from a satisfying assignment ι of the MaxSMT instance ensures that T_ι is a syntactically valid TA model, for instance by placing non-negativity constraints on repaired clock bounds. In case repairs alter the right-hand sides of clock constraints to rational numbers, this can easily be fixed by normalizing all clock constraints in the TA.

From a *semantic* perspective, the impact of the repairs is more profound. Since the repairs affect time bounds in location invariants and transition guards, as well as clock resets, the behavior of T<sup>ι</sup> may be fundamentally different from the behavior of T.


It should be pointed out that we assess admissibility of a repair leading to T<sup>ι</sup> with respect to a given TA model T, and not with respect to a correct TA model T <sup>∗</sup> satisfying Π.

*Functional Equivalence.* While various variants of semantic admissibility may be considered, we focus on a notion of admissibility ensuring that a repair does not unduly change the functional behavior of the modeled system while adhering to the timing constraints of the repaired system. We refer to this as *functional equivalence*. The functional capabilities of a timed system manifest themselves in the sets of action or transition traces that the system can execute. For TAs T and T_ι this means that we need to consider the languages over the action or transition alphabets that these TAs define. Considering the timed languages of T and T_ι, we observe that L_T(T) ≠ L_T(T_ι), since the repair forces at least one timed trace to be purged from L_T(T). This means that equivalence of the timed languages cannot serve as an admissibility criterion ensuring functional equivalence. At the other end of the spectrum, we may relate the de-timed languages of T and T_ι. The *de-time* operator α(T) omits all timing constraints and clock resets from a TA T. Requiring L(α(T)) = L(α(T_ι)) is tempting, since it states that when eliminating all timing-related features from T and from the repaired T_ι, the resulting action languages are identical.

However, this admissibility criterion would be flawed, since the repair in T_ι may imply that locations unreachable in T become reachable in T_ι, and vice versa. This may have an impact on the untimed languages: even though L(α(T)) = L(α(T_ι)), it may be that L_μ(T) ≠ L_μ(T_ι). To illustrate this point, consider the running example in Fig. 2 and assume that the invariant of location dbServer.reqReceiving is modified from z ≤ 2 to z ≤ 1 in the repaired TA T_ι. Applying the de-time operator to T_ι implies that the location dbServer.timeout, which is unreachable in T_ι, becomes reachable in the de-timed model. Since dbServer.timeout is reachable in T, the TAs T and T_ι are not functionally equivalent, even though their de-timed languages are identical. Notice that for the untimed languages L_μ(T) ≠ L_μ(T_ι) holds, since no timed trace in L_T(T_ι) reaches location timeout, even though such a timed trace exists in L_T(T). In detail, L_μ(T) contains the untimed trace θ_0 θ_1 θ_2 θ_3 θ_4, which is missing from L_μ(T_ι), where θ_4 is the transition towards the location dbServer.timeout. As a consequence, we resort to comparing the untimed languages of T and T_ι and require L_μ(T) = L_μ(T_ι). It is easy to see that L_μ(T) = L_μ(T_ι) ⇒ L(α(T)) = L(α(T_ι)). In other words, the equivalence of the untimed languages ensures functional equivalence.

*Admissibility Test.* Designing an algorithmic admissibility test for functional equivalence is challenging due to the computational complexity of determining the equivalence of the untimed languages L_μ(T) and L_μ(T_ι). While language equivalence is decidable for languages defined by Büchi automata, it is undecidable for timed languages [3]. For untimed languages, however, the problem is again decidable [3]. The algorithmic implementation of the test for functional equivalence that we propose proceeds in two steps.


*Automata for Untimed Languages.* The construction of an automaton representing an untimed language, referred to here as an *untime construction*, has so far been proposed based on a region abstraction [3]. The region abstraction is known to be relatively inefficient, since the number of regions is, among other things, exponential in the number of clocks [4]. We therefore propose an untime construction based on a zone automaton [14], which in the worst case has the same complexity as the region automaton but is on average more succinct [7].

**Definition 3 (Untimed Büchi Automaton).** *Assume a TA* T *and the corresponding zone automaton* T_Z = (S_Z, s_Z^0, Σ_Z, Θ_Z). *We define the* untimed Büchi automaton *as the closed BA* B_T = (S, Σ, →, S_0, F) *obtained from* T_Z *such that* S = S_Z, Σ = Σ_Z \ {δ} *and* S_0 = {s_Z^0}. *For every transition in* Θ_Z *with a label* a ∈ Σ *we add a transition to* → *according to the rule*

$$\frac{(l, z) \stackrel{\delta}{\leadsto} (l, z^{\uparrow}) \stackrel{a}{\leadsto} (l', z')}{(l, z) \stackrel{a}{\longrightarrow} (l', z')}$$

*with* z^↑ = {v + d | v ∈ z, d ∈ ℝ_{≥0}}. *In addition, we add self-transitions* (l, z) →^τ (l, z) *to every state* (l, z) ∈ S.

The following observations justify this definition:


**Theorem 2 (Correctness of the Untimed Büchi Automaton Construction).** *For an untimed Büchi automaton* B_T *derived from a TA* T *according to Definition 3, it holds that* L(B_T) = L_μ(T).
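A minimal sketch of the untime construction over an abstract zone graph follows; zones are opaque labels here and the `untime` helper is hypothetical (real zone graphs come from tools such as UPPAAL):

```python
def untime(states, edges, delay="δ", tau="τ"):
    """Sketch of the untime construction of Definition 3: whenever a delay
    step from (l, z) reaches (l, z↑) and an action step leads on to (l', z'),
    an action edge from (l, z) to (l', z') is added; τ self-loops are added
    to every state. `edges` maps (state, label) -> set of successor states."""
    out = {}

    def add(s, a, t):
        out.setdefault((s, a), set()).add(t)

    for (s, a), succs in edges.items():
        if a != delay:
            for t in succs:
                add(s, a, t)               # keep direct action steps
        else:
            for up in succs:               # s --δ--> up --a--> t becomes s --a--> t
                for (s2, a2), succs2 in edges.items():
                    if s2 == up and a2 != delay:
                        for t in succs2:
                            add(s, a2, t)
    for s in states:
        add(s, tau, s)                     # τ self-transition on every state
    return out

# A delay step followed by action "a" collapses into a single "a" edge:
zg = {("A", "δ"): {"B"}, ("B", "a"): {"C"}}
print(untime({"A", "B", "C"}, zg)[("A", "a")])   # {'C'}
```

Note that the δ-labels themselves disappear from the result, matching Σ = Σ_Z \ {δ} in the definition.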

*Equivalence Check for Untimed Languages.* Given that the zone automaton construction delivers closed BAs, we can reduce the admissibility test L_μ(T) = L_μ(T_ι), defined over infinite languages, to an equivalence test over the finite prefixes of these languages, represented by interpreting the zone automata as NFAs. The following theorem justifies this reduction.

**Theorem 3 (Language Equivalence of Closed BAs).** *Given closed Büchi automata* B *and* B′, *if* L_f(B) = L_f(B′) *then* L(B) = L(B′).
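A sketch of the resulting finite-language check: since every state of a closed BA is accepting, L_f-equivalence reduces to asking whether, along any input word, exactly one of the two determinized automata dies. The `nfa_equiv` helper below is an illustrative assumption, not TARTAR code:

```python
from collections import deque

def nfa_equiv(nfa1, nfa2):
    """Check L_f(B) = L_f(B') for closed automata (every state accepting)
    by determinising both on the fly and comparing reachable subset pairs.
    An automaton is a pair (initial_states, delta) with
    delta: (state, letter) -> set of successor states."""
    def dstep(subset, a, delta):
        return frozenset(t for s in subset for t in delta.get((s, a), ()))
    sigma = {a for (_, a) in nfa1[1]} | {a for (_, a) in nfa2[1]}
    start = (frozenset(nfa1[0]), frozenset(nfa2[0]))
    seen, queue = {start}, deque([start])
    while queue:
        p, q = queue.popleft()
        if (not p) != (not q):       # one side rejects while the other accepts
            return False
        for a in sigma:
            nxt = (dstep(p, a, nfa1[1]), dstep(q, a, nfa2[1]))
            if any(nxt) and nxt not in seen:   # skip the dead (∅, ∅) pair
                seen.add(nxt)
                queue.append(nxt)
    return True

# Both automata accept a* (every state accepting):
b1 = ({"0"}, {("0", "a"): {"0"}})
b2 = ({"x"}, {("x", "a"): {"y"}, ("y", "a"): {"x"}})
print(nfa_equiv(b1, b2))   # True
```

The number of reachable subset pairs is exponential in the worst case, which is consistent with the classical complexity of NFA equivalence.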

*Discussion.* One may want to adapt the admissibility test so that it only considers divergent traces, e.g., in cases where only unbounded liveness properties need to be preserved by a repair. This can be accomplished as follows. First, an over-approximating non-zenoness test [4] can be applied to T and T_ι. If it shows non-zenoness, then one knows that the respective TA does not include convergent traces. If this test fails, a more expensive test is needed: it requires a construction of the untimed Büchi automata using the approach from [3], followed by a language equivalence test of the untimed languages accepted by the untimed BAs using, for instance, the automata-theoretic constructions proposed in [9].

#### **6 Case Studies and Experimental Evaluation**

We have implemented the repair computation and admissibility test in a proof-of-concept tool called TARTAR. We present the architecture of TARTAR and then evaluate the proposed method by applying TARTAR to several case studies.

*Tool Architecture.* The control loop of TARTAR, depicted in Fig. 3, computes repairs for a given UPPAAL model and a given property Π using the following steps:


*Evaluation Strategy.* The evaluation of our analysis is based on ideas taken from mutation testing [18]. Mutation testing evaluates a test set by systematically modifying the program code to be tested and computing the ratio of modifications that are detected by the test set. Real-time system models that contain violations of timed safety properties are not available in significant numbers. We therefore need to seed faults in

**Fig. 3.** Control loop of TARTAR

existing models and check whether those faults can be found by our automated repair. An objective of mutation testing is that testing a proportion of the possible modifications already yields satisfactory results [18]. In order to evaluate repairs for erroneous clock bounds in invariants and transition guards, we seed modifications to all bounds of clock constraints by the amounts {−10, −1, +1, +0.1·M, +M}, where M is the maximal bound a clock is compared against in the given model. If a seeded modification leads to a syntactically invalid UPPAAL model, then UPPAAL returns an exception and we ignore the modification. In analogy to mutation testing, we compute the number of TDTs for which our analysis finds an admissible repair.
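The seeding strategy can be sketched as follows; `seed_mutations` and the constraint identifiers are hypothetical, and the real seeding edits UPPAAL models rather than a dictionary:

```python
def seed_mutations(bounds):
    """Generate the seeded clock-bound modifications used for the evaluation:
    every bound is offset by each of -10, -1, +1, +0.1*M and +M, where M is
    the maximal bound occurring in the model. `bounds` maps a (hypothetical)
    constraint identifier to its clock bound."""
    m = max(bounds.values())
    offsets = [-10, -1, 1, 0.1 * m, m]
    return [(cid, b + off)                   # one mutant per (constraint, offset)
            for cid, b in bounds.items()
            for off in offsets]

mutants = seed_mutations({"inv_x": 4, "guard_y": 1})
print(len(mutants))   # 2 constraints x 5 offsets = 10 mutants
```

Mutants whose bound becomes negative correspond to the syntactically invalid models that UPPAAL rejects and that are discarded.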

*Experiments.* We have applied this modification seeding strategy to eight UPPAAL models (see Table 1). Not all of the models that we considered have been published with a property that can be violated by mutating a clock constraint. For those models, we suggest a suitable timed safety property specifying an invariant condition. In particular, we add a property to the Bando [29] model which ensures that, for as long as the sender is active, its clock never exceeds the value of 28,116 time units. In the FDDI token ring protocol [29], the property that we use checks whether the first member of the ring never remains for more than 140 time units in any given state. The Viking model is taken from the set of test models of opaal [26]. For this model we use a property that checks whether one of the Viking processes can only enter a safe state during the first 60 time units. Note that all of these properties are satisfied by the unmodified models.

The results of the clock bound repair computed by TARTAR for all considered models are summarized in Table 1. The seeded modifications are characterized quantitatively by the count *#Seed* of analyzed modified models, the count *#TDT* of modified models that return a TDT for the considered property, the maximal time T*UP* UPPAAL needs to create a TDT per analyzed model, and the length *Len.* of the longest TDT found. For the computation of a repair we give the count *#Rep.* of all computed repairs, the count *#Adm.* of computed admissible repairs, the count *#Sol.* of TDTs for which an admissible repair was found, the maximal time T*QE* required by the quantifier elimination, the average time T*R* to compute a repair, the standard deviation *SDR* of the repair computation time, the time T*Adm* for an admissibility check, the maximal count of variables *#Var*, and the maximal count of constraints *#Con.* in the repair constraint system. The maximal memory consumption was at most 17 MB for the repair analysis and 478 MB for the admissibility test. We performed all experiments on a computer with an i7-6700K CPU (4.0 GHz), 60 GB of RAM, and a Linux operating system.

We found 60 TDTs by seeding violations of the timed safety property and TARTAR returned 204 repairs for these TDTs. TARTAR proposed an admissible repair for 55 (91%) TDTs and at least one repair for 57 (95%) TDTs. For 3 out of the total of 14 TDTs found for the SBR model no repair was computed since the timeout of the quantifier elimination was reached after 2 minutes. For all other models, no timeout occurred.

Space limitations do not permit us to describe all models and computed repairs in detail; we therefore focus on the pacemaker case study. One of the modifications increases a location invariant of this model that controls the minimal heart period from 400 to 1,600. The modification allows the pacemaker to delay an induced ventricular beat for so long that it violates the property that the time between two ventricular beats of a heart is never longer than the maximal heart period of 1,000. TARTAR finds three repairs. Two repairs reduce the maximal time delay between two ventricular or atrial heart beats of the patient. These repairs are classified as inadmissible. In the model context this appears to be reasonable, since they would restrict the environment of the pacemaker, and not the pacemaker itself. The third repair is admissible and reduces the bound modified during seeding by 600.5. The minimal heart period is then less than or equal to the maximal heart period of 1,000.

*Result Interpretation.* Our repair strategy minimizes the number of repairs but does not optimize the computed value. For instance, in the pacemaker model the computed repair of 600.5 would be a correct and admissible repair even if the value was reduced to 600, which would be the minimal possible repair value.

A comparison of the values T*QE* and T*R* reveals that, perhaps unsurprisingly, the quantifier elimination step is computationally almost an order of magnitude more expensive than the repair computation. Overall, the computational cost (T*QE* + T*R*) correlates with the number of variables in the constraint system, which depends in turn on the length of the TDT and the number of clocks referenced along the TDT. Consider, for instance, that the pacemaker model has a TDT of maximal length 9 with 116 variables, and the repair requires 0.193 s and 2.070 MB. On the other hand, the Bando model produces a longer maximal TDT of length 279 with 1,156 variables and requires 6.555 s and 16.650 MB. The impact of the number of clock constraints and clock variables on the computation costs can be seen, for instance, in the data for the pacemaker and Viking models. While the pacemaker model has a shorter TDT than the Viking model (9 vs. 18), the constraint counts (294 vs. 140) of the pacemaker model are higher than for


**Table 1.** Experimental results for clock bound repair computation using TARTAR

the Viking model, which coincides with a higher computation time (0.193 s vs. 0.042 s) and a higher memory consumption (2.070 MB vs. 0.910 MB) compared to the Viking model.

We analyzed, for every TDT, the relationship between the length of the TDT and the computation time for a repair (T_r = T*QE* + T*R*), as well as the relationship between *#Var* and T_r, by estimating Kendall's tau [13]. Kendall's tau is a measurement of the ordinal association between two measured quantities. A correlation is considered significant if the probability p that there is actually no correlation in a larger data set is below a certain threshold. The length of a TDT is significantly related (τ_1 = 0.673, p < .001) to T_r. Also, *#Var* is significantly related (τ_2 = 0.759, p < .001) to T_r. *#Var* accounts for the clocks at every step of a TDT, hence the combination of trace length and clock count tends to correlate with T_r more strongly than the trace length on its own. This supports our conjecture that the computation time of a repair depends on both the trace length and the clock count.
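For reference, Kendall's tau-a (the variant without tie correction) can be computed as follows; the sample data is purely illustrative and not the raw measurements behind the reported values:

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs divided by the total
    number of pairs, measuring ordinal association between two samples."""
    n = len(xs)
    conc = disc = 0
    for i, j in combinations(range(n), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        conc += s > 0    # both quantities ordered the same way
        disc += s < 0    # ordered in opposite ways
    return (conc - disc) / (n * (n - 1) / 2)

# Illustrative pairing of TDT lengths and repair times: strictly increasing
# together, hence perfectly concordant.
print(kendall_tau([9, 18, 279], [0.042, 0.193, 6.555]))   # 1.0
```

A value of 1.0 indicates perfect concordance and −1.0 perfect discordance; the paper's values of 0.673 and 0.759 indicate a strong but imperfect ordering agreement.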

The admissibility test appears to be quite efficient, with a maximum computation time of 34.120 s for the SBR model, which is one of the more complex models that were considered. We observed that most models were action-deterministic, which has a positive influence on the language equivalence test used during admissibility checking.

#### **7 Conclusion**

We have presented an approach to derive minimal repairs for timed reachability properties of TA and NTA models from TDTs in order to facilitate fault localization and debugging of such models during the design process. Our approach includes a formalization of TDTs using linear real arithmetic, a repair strategy based on MaxSMT solving, the definition of an admissibility criterion and test for the computed repairs, the development of a prototypical analysis and repair tool, and the application of the proposed method to a number of case studies of realistic complexity. To the best of our knowledge, this is the first rigorous treatment of counterexamples in real-time model checking. We are also not aware of any existing repair approaches for TA or NTA models. This makes a comparative experimental evaluation impossible. We have nonetheless observed that our analysis computes a significant number of admissible repairs within realistic computation time bounds and memory consumption.

Future research will address the development and implementation of repair strategies for further syntactic features in TAs and NTAs, including false comparison operators in invariants and guards, erroneous clock variable references, superfluous or missing resets for clocks, and wrong urgent state choices. We will furthermore address the interplay between different repairs and develop refined strategies to determine their admissibility. Finally, we plan to extend the approach developed in this paper to derive criteria for the actual causation of timing property violations in NTA models based on the counterfactual reasoning paradigm for causation.

**Acknowledgments.** We wish to thank Nikolaj Bjørner and Zvonimir Pavlinovic for advice on the use of Z3. We are grateful to Sarah Stoll for helping us with the statistical evaluation of the experimental results. This work is in part supported by the National Science Foundation (NSF) under grant CCF-1350574.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Verifying Asynchronous Interactions via Communicating Session Automata

Julien Lange<sup>1</sup> and Nobuko Yoshida<sup>2</sup>

<sup>1</sup> University of Kent, Canterbury, UK j.s.lange@kent.ac.uk <sup>2</sup> Imperial College London, London, UK

Abstract. This paper proposes a sound procedure to verify properties of communicating session automata (csa), i.e., communicating automata that include multiparty session types. We introduce a new *asynchronous* compatibility property for csa, called k-multiparty compatibility (k-mc), which is a strict superset of the synchronous multiparty compatibility used in theories and tools based on session types. It is decomposed into two bounded properties: (i) a condition called k*-safety* which guarantees that, within the bound, all sent messages can be received and each automaton can make a move; and (ii) a condition called k*-exhaustivity* which guarantees that all k-reachable send actions can be fired within the bound. We show that k-exhaustivity implies existential boundedness, and *soundly and completely* characterises systems where each automaton behaves equivalently under bounds greater than or equal to k. We show that checking k-mc is pspace-complete, and demonstrate its scalability empirically over large systems (using partial order reduction).

#### 1 Introduction

Communicating automata are a Turing-complete model of asynchronous interactions [10] that has become one of the most prominent for studying point-to-point communications over unbounded first-in-first-out channels. This paper focuses on a class of communicating automata, called *communicating session automata* (csa), which strictly includes automata corresponding to *asynchronous multiparty session types* [28]. Session types originated as a typing discipline for the π-calculus [27,66], where a session type dictates the behaviour of a process wrt. its communications. Session types and related theories have been applied to the verification and specification of concurrent and distributed systems through their integration in several mainstream programming languages, e.g., Haskell [44,55], Erlang [49], F# [48], Go [11,37,38,51], Java [30,31,34,65], OCaml [56], C [52], Python [16,47,50], Rust [32], and Scala [61,62]. Communicating automata and asynchronous multiparty session types [28] are closely related: the latter can be seen as a syntactical representation of the former [17] where a sending state corresponds to an internal choice and a receiving state to an external choice. This correspondence between communicating automata and multiparty session types has become the foundation of many tools centred on session types, e.g., for generating communication API from multiparty session (global) types [30,31,48,61], for detecting deadlocks in message-passing programs [51,67], and for monitoring session-enabled programs [5,16,47,49,50]. These tools rely on a property called *multiparty compatibility* [6,18,39], which guarantees that communicating automata representing session types interact correctly, hence enabling the identification of correct protocols or the detection of errors in endpoint programs. 
Multiparty compatible communicating automata validate two essential requirements for session types frameworks: every message that is sent can be eventually received and each automaton can always eventually make a move. Thus, they satisfy the *abstract* safety invariant ϕ for session types from [63], a prerequisite for session type systems to guarantee safety of the typed processes. Unfortunately, multiparty compatibility suffers from a severe limitation: it requires that each execution of the system has a synchronous equivalent. Hence, it rules out many correct systems. Hereafter, we refer to this property as *synchronous multiparty compatibility* (smc) and explain its main limitation with Example 1.

*Example 1.* The system in Fig. 1 contains an interaction pattern that is *not* supported by any definition of smc [6,18,39]. It consists of a client (c), a server (s), and a logger (l), which communicate via unbounded fifo channels. Transition sr!*a* denotes that the sender puts (asynchronously) message *a* on channel sr; transition sr?*a* denotes the consumption of *a* from channel sr by the receiver. The client sends a *req*uest and some *data* in a fire-and-forget fashion, before waiting for a response from the server. Because of the presence of this simple pattern, the system cannot be executed synchronously (i.e., with the restriction that a send action can only be fired when a matching receive is enabled), hence it is rejected by all definitions of smc from previous works, even though the system is safe (all sent messages are received and no automaton gets stuck).

Synchronous multiparty compatibility is reminiscent of a strong form of existential boundedness. Among the existing sub-classes of communicating automata (see [46] for a survey), existentially k-bounded communicating automata [22] stand out because they can be model-checked [8,21] and they restrict the model in a natural way: any execution can be rescheduled such that the number of pending messages *that can be received* is bounded by k. However, existential boundedness is generally *undecidable* [22], even for a fixed bound k. This shortcoming makes it impossible to know when theoretical results are applicable.

To address the limitation of smc and the shortcoming of existential boundedness, we propose a (decidable) sufficient condition for existential boundedness, called *k-exhaustivity*, which serves as the basis for a new, wider notion of compatibility, called *k-multiparty compatibility* (k-mc), where k ∈ ℕ>0 is a bound on the number of pending messages in each channel. A system is k-mc when it is (i) *k-exhaustive*, i.e., all k-reachable send actions are enabled within the bound, and (ii) *k-safe*, i.e., within the bound k, all sent messages can be received and each automaton can always eventually progress. For example, the system in Fig. 1 is k-multiparty compatible for any k ∈ ℕ>0, hence it does not lead to communication errors, see Theorem 1.

Fig. 1. Client-Server-Logger example.

The k-mc condition is a natural constraint for real-world systems. Indeed, any finite-state system is k-exhaustive (for k sufficiently large), while any system that is not k-exhaustive (resp. k-safe) for any k is unlikely to work correctly. Furthermore, we show that if a system of csa validates k-exhaustivity, then each automaton locally behaves equivalently under any bound greater than or equal to k, a property that we call *local bound-agnosticity*. We give a *sound and complete* characterisation of k-exhaustivity for csa in terms of local bound-agnosticity, see Theorem 3. Additionally, we show that checking k-mc is pspace-complete (i.e., no harder than related algorithms) and we demonstrate empirically that its cost can be mitigated through (sound and complete) partial order reduction.

In this paper, we consider *communicating session automata* (csa), which cover the most common form of asynchronous multiparty session types [15] (see *Remark* 3), and have been used as a basis to study properties and extensions of session types [6,7,18,30,31,41,42,47,49,50]. More precisely, csa are deterministic automata in which every state is either sending (internal choice), receiving (external choice), or final. We focus on csa that preserve the intent of internal and external choices from session types. In these csa, whenever an automaton is in a sending state, it can fire any of its transitions, regardless of whether channels are bounded; when it is in a receiving state, at most one of its actions may be enabled.

*Synopsis.* In Sect. 2, we give the necessary background on communicating automata and their properties, and introduce the notions of output/input bound independence which guarantee that internal/external choices are preserved in bounded semantics. In Sect. 3, we introduce the definition of k-multiparty compatibility (k-mc) and show that k-mc systems which validate the bound independence properties are safe. In Sect. 4, we formally relate existential boundedness [22,35], synchronisability [9], and k-exhaustivity. In Sect. 5 we present an implementation (using partial order reduction) and an experimental evaluation of our theory. We discuss related works in Sect. 6 and conclude in Sect. 7.

See [43] for a full version of this paper (including proofs and additional examples). Our implementation and benchmark data are available online [33].

#### 2 Communicating Automata and Bound Independence

This section introduces notations and definitions of communicating automata (following [12,39]), as well as the notion of output (resp. input) bound independence which enforces the intent of internal (resp. external) choice in csa.

Fix a finite set P of *participants* (ranged over by p, q, r, s, etc.) and a finite alphabet Σ. The set of *channels* is C ≝ {pq | p, q ∈ P and p ≠ q}, A ≝ C × {!, ?} × Σ is the set of *actions* (ranged over by ℓ), Σ∗ (resp. A∗) is the set of finite words on Σ (resp. A). Let w range over Σ∗, and φ, ψ range over A∗. Also, ε (∉ Σ ∪ A) is the empty word, |w| denotes the length of w, and w·w′ is the concatenation of w and w′ (these notations are overloaded for words in A∗).

Definition 1 (Communicating automaton). *A* communicating automaton *is a finite transition system given by a triple* M = (Q, q0, δ) *where* Q *is a finite set of* states*,* q0 ∈ Q *is the initial state, and* δ ⊆ Q × A × Q *is a set of* transitions*.*

The transitions of a communicating automaton are labelled by actions in A of the form sr!*a*, representing the *emission* of message a from participant s to r, or sr?*a*, representing the *reception* of a by r. Define *subj*(pq!*a*) = *subj*(qp?*a*) = p, *obj*(pq!*a*) = *obj*(qp?*a*) = q, and *chan*(pq!*a*) = *chan*(pq?*a*) = pq. The projection of ℓ onto p is defined as πp(ℓ) = ℓ if *subj*(ℓ) = p and πp(ℓ) = ε otherwise. Letting † and †′ range over {!, ?}, we define: π†pq(pq†*a*) = *a* and π†pq(sr†′*a*) = ε if either pq ≠ sr or † ≠ †′. We extend these definitions to sequences of actions in the natural way.

A state q ∈ Q with no outgoing transition is *final*; q is *sending* (resp. *receiving*) if it is not final and all its outgoing transitions are labelled by send (resp. receive) actions, and q is *mixed* otherwise. M = (Q, q0, δ) is *deterministic* if ∀(q, ℓ, q′), (q, ℓ′, q″) ∈ δ : ℓ = ℓ′ ⟹ q′ = q″. M = (Q, q0, δ) is *send* (resp. *receive*) *directed* if for all sending (resp. receiving) q ∈ Q and (q, ℓ, q′), (q, ℓ′, q″) ∈ δ : *obj*(ℓ) = *obj*(ℓ′). M is *directed* if it is send and receive directed.
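The state classification and the determinism condition above can be encoded directly. The sketch below is a minimal illustration (not the paper's tooling); the transition relation and automaton are hypothetical, with δ represented as a list of (state, action, state) triples and an action written ((sender, receiver), op, msg) with op in {"!", "?"}.

```python
def classify(state, delta):
    """Classify a state as 'final', 'sending', 'receiving', or 'mixed'
    from the operations labelling its outgoing transitions."""
    ops = {op for (q, (_, op, _), _) in delta if q == state}
    if not ops:
        return "final"
    if ops == {"!"}:
        return "sending"
    if ops == {"?"}:
        return "receiving"
    return "mixed"

def is_deterministic(delta):
    """Deterministic: two transitions from the same state with the same
    label must lead to the same successor state."""
    seen = {}
    for (q, label, q2) in delta:
        if seen.setdefault((q, label), q2) != q2:
            return False
    return True

# Hypothetical automaton for participant p: send a to q, then await b from q.
DELTA_P = [(0, (("p", "q"), "!", "a"), 1),
           (1, (("q", "p"), "?", "b"), 2)]
```

This automaton has no mixed states, so (together with determinism) it is a communicating session automaton in the sense of Remark 1.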

*Remark 1.* In this paper, we consider only deterministic communicating automata without mixed states, and call them *Communicating Session Automata* (csa). We discuss possible extensions of our results beyond this class in Sect. 7.

Definition 2 (System). *Given a communicating automaton* Mp = (Qp, q0p, δp) *for each* p ∈ P*, the tuple* S = (Mp)p∈P *is a* system*. A* configuration *of* S *is a pair* s = (*q*; *w*) *where* *q* = (qp)p∈P *with* qp ∈ Qp *and where* *w* = (wpq)pq∈C *with* wpq ∈ Σ∗*; component* *q* *is the* control state *and* qp ∈ Qp *is the* local state *of automaton* Mp*. The* initial configuration *of* S *is* s0 = (*q0*; *ε*) *where* *q0* = (q0p)p∈P *and where we write* *ε* *for the* |C|*-tuple* (ε, …, ε)*.*

Hereafter, we fix a communicating session automaton Mp = (Qp, q0p, δp) for each p ∈ P and let S = (Mp)p∈P be the corresponding system whose initial configuration is s0. For each p ∈ P, we assume that ∀(q, ℓ, q′) ∈ δp : *subj*(ℓ) = p. We assume that the components of a configuration are named consistently, e.g., for s′ = (*q′*; *w′*), we implicitly assume that *q′* = (q′p)p∈P and *w′* = (w′pq)pq∈C.

Definition 3 (Reachable configuration). *Configuration* s′ = (*q′*; *w′*) *is* reachable *from configuration* s = (*q*; *w*) *by* firing transition ℓ*, written* s −ℓ→ s′ *(or* s −→ s′ *when* ℓ *is not relevant), if there are* s, r ∈ P *and* a ∈ Σ *such that either:*

1. ℓ = sr!*a* *and* (qs, ℓ, q′s) ∈ δs*,* q′p = qp *for all* p ≠ s*,* w′sr = wsr·a*, and* w′pq = wpq *for all* pq ≠ sr*; or*
2. ℓ = sr?*a* *and* (qr, ℓ, q′r) ∈ δr*,* q′p = qp *for all* p ≠ r*,* wsr = a·w′sr*, and* w′pq = wpq *for all* pq ≠ sr*.*
*Remark 2.* Hereafter, we assume that any bound k is finite and k ∈ ℕ>0.

We write −→∗ for the reflexive and transitive closure of −→. Configuration (*q*; *w*) is k-bounded if ∀pq ∈ C : |wpq| ≤ k. We write s1 −ℓ1···ℓn→ sn+1 when s1 −ℓ1→ s2 ··· sn −ℓn→ sn+1, for some s2, …, sn (with n ≥ 0); and say that the *execution* ℓ1···ℓn is k*-bounded from* s1 if ∀1 ≤ i ≤ n+1 : si is k-bounded. Given φ ∈ A∗, we write p ∉ φ iff φ = φ0·ℓ·φ1 ⟹ *subj*(ℓ) ≠ p. We write s −φ→k s′ if s′ is reachable from s by a k-bounded execution φ. The set of *reachable configurations of* S is RS(S) = {s | s0 −→∗ s}. The k*-reachability set of* S is the largest subset RSk(S) of RS(S) within which each configuration s can be reached by a k-bounded execution from s0.
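Since bounding every channel by k makes the configuration space finite, the k-reachability set can be computed by an explicit-state breadth-first search. The sketch below is illustrative only (the paper's implementation, discussed in Sect. 5, uses partial order reduction); the two-party ping-pong system and the action encoding as (sender, receiver, op, msg) tuples are our own assumptions.

```python
from collections import deque
from itertools import product

# Hypothetical two-party system: p sends a to q, then waits for q's reply b.
AUTOMATA = {
    "p": {"init": 0, "states": {0: [(("p", "q", "!", "a"), 1)],
                                1: [(("q", "p", "?", "b"), 2)],
                                2: []}},
    "q": {"init": 0, "states": {0: [(("p", "q", "?", "a"), 1)],
                                1: [(("q", "p", "!", "b"), 2)],
                                2: []}},
}

def k_reachable(automata, k):
    """Breadth-first computation of the k-bounded reachability set RS_k(S):
    a configuration pairs the control states with the channel queues; a send
    fires only if the target queue holds fewer than k messages, a receive
    only if its message sits at the head of the queue (FIFO semantics)."""
    parts = sorted(automata)
    channels = [(s, r) for s, r in product(parts, parts) if s != r]
    init = (tuple((p, automata[p]["init"]) for p in parts),
            tuple((c, ()) for c in channels))
    seen, todo = {init}, deque([init])
    while todo:
        control, queues = todo.popleft()
        ctrl, qs = dict(control), dict(queues)
        for p in parts:
            for (s, r, op, msg), nxt in automata[p]["states"][ctrl[p]]:
                if op == "!" and len(qs[(s, r)]) < k:
                    new_qs = {**qs, (s, r): qs[(s, r)] + (msg,)}
                elif op == "?" and qs[(s, r)][:1] == (msg,):
                    new_qs = {**qs, (s, r): qs[(s, r)][1:]}
                else:
                    continue
                conf = (tuple(sorted({**ctrl, p: nxt}.items())),
                        tuple(sorted(new_qs.items())))
                if conf not in seen:
                    seen.add(conf)
                    todo.append(conf)
    return seen
```

On this ping-pong system the 1-bounded and 2-bounded reachability sets coincide, which is consistent with the system never needing more than one pending message per channel.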

Definition 4 streamlines notions of safety from previous works [6,12,18,39] (absence of deadlocks, orphan messages, and unspecified receptions).

Definition 4 (k-Safety). S *is* k-safe *if the following holds* ∀(*q*; *w*) ∈ RSk(S)*:*

- (er) ∀pq ∈ C*, if* wpq = a·w′*, then there is* s′ *such that* (*q*; *w*) −→k∗ s′ −pq?*a*→k*;*
- (pg) ∀p ∈ P*, if* qp *is a receiving state, then there are* s′*,* q ∈ P*, and* a ∈ Σ *such that* (*q*; *w*) −→k∗ s′ −qp?*a*→k*.*

*We say that* S *is* safe *if it validates the unbounded version of* k*-safety (*∞*-safe).*

Property (er), called *eventual reception*, requires that any sent message can always eventually be received (i.e., if *a* is the head of a queue then there must be an execution that consumes *<sup>a</sup>*), and Property (pg), called *progress*, requires that any automaton in a receiving state can eventually make a move (i.e., it can always eventually receive an *expected* message).

We say that a configuration s is *stable* iff s = (*q*; *ε*), i.e., all its queues are empty. Next, we define the *stable property* for systems of communicating automata, following the definition from [18].

Definition 5 (Stable). S *has the* stable property *(*sp*) if* ∀s ∈ RS(S) : ∃(*q*; *ε*) ∈ RS(S) : s −→∗ (*q*; *ε*)*.*

A system has the stable property if it is possible to reach a stable configuration from any reachable configuration. This property is called *deadlock-free* in [22]. The stable property implies the eventual reception property, but not safety (e.g., an automaton may be waiting for an input in a stable configuration, see Example 2), and safety does not imply the stable property, see Example 4.

*Example 2.* The following system has the stable property, but it is not safe.

*(Diagram: the automata for participants* p*,* q*, and* r *of Example 2, with transitions over* pq!*a*, pq!*b*, pq?*a*, pq?*b*, qr!*c*, qr?*c*; *figure not recoverable from this extraction.)*

Next, we define two properties related to *bound independence*. They specify classes of csa whose branching behaviours are not affected by channel bounds.

Definition 6 (k-obi). S *is* k-output bound independent *(*k*-*obi*), if* ∀s = (*q*; *w*) ∈ RSk(S) *and* ∀p ∈ P*, if* s −pq!*a*→k*, then* ∀(qp, pr!*b*, q′p) ∈ δp : s −pr!*b*→k*.*

Fig. 2. Example of a *non*-ibi and *non*-safe system.

Definition 7 (k-ibi). S *is* k-input bound independent *(*k*-*ibi*), if* ∀s = (*q*; *w*) ∈ RSk(S) *and* ∀p ∈ P*, if* s −qp?*a*→k*, then* ∀ℓ ∈ A : s −ℓ→k ∧ *subj*(ℓ) = p ⟹ ℓ = qp?*a*.

If S is k-obi, then any automaton that reaches a sending state is able to fire any of its available transitions, i.e., sending states model *internal choices* which are not constrained by bounds greater than or equal to k. Note that the unbounded version of k-obi (k = ∞) is trivially satisfied for any system due to unbounded asynchrony. If S is k-ibi, then any automaton that reaches a receiving state is able to fire at most one transition, i.e., receiving states model *external choices* where the behaviour of the receiving automaton is controlled exclusively by its environment. We write ibi for the unbounded version of k-ibi (k = ∞).

Checking the ibi property is generally undecidable. However, systems consisting of (send and receive) *directed* automata are trivially k-ibi and k-obi for all k; this subclass of csa was referred to as *basic* in [18]. We introduce larger decidable approximations of ibi with Definitions 10 and 11.

Proposition 1. *(1) If* S *is send directed, then* S *is* k*-*obi *for all* k ∈ ℕ>0*. (2) If* S *is receive directed, then* S *is* ibi *(and* k*-*ibi *for all* k ∈ ℕ>0*).*

*Remark 3.* csa validating k-obi and ibi strictly include the most common forms of asynchronous multiparty session types, e.g., the directed csa of [18], and systems obtained by projecting Scribble specifications (global types), which need to be receive directed (this is called "consistent external choice subjects" in [31]) and which validate 1-obi by construction since they are projections of synchronous specifications where choices must be located at a unique sender.

#### 3 Bounded Compatibility for csa

In this section, we introduce *k-multiparty compatibility* (k-mc) and study its properties wrt. safety of communicating session automata (csa) which are k-obi and ibi. Then, we soundly and completely characterise k-exhaustivity in terms of local bound-agnosticity, a property which guarantees that communicating automata behave equivalently under any bound greater than or equal to k.

#### 3.1 Multiparty Compatibility

The definition of k-mc is divided into two parts: (i) *k-exhaustivity* guarantees that the set of k-reachable configurations contains enough information to make a sound decision wrt. safety of the system; and (ii) *k-safety* (Definition 4) guarantees that a subset of all possible executions is free of any communication errors. Next, we define k-exhaustivity, then k-multiparty compatibility. Intuitively, a system is k-exhaustive if, for all k-reachable configurations, whenever a send action is enabled, then it can be fired within a k-bounded execution.

*(Diagram: the automata* Mp*,* Mq*,* Nq*, and* N′q *of Fig. 3; figure not recoverable from this extraction.)*

Fig. 3. (Mp, Mq) is non-exhaustive, (Mp, Nq) is 1-exhaustive, (Mp, N′q) is 2-exhaustive.

Definition 8 (k-Exhaustivity). S *is* k-exhaustive *if* ∀(*q*; *w*) ∈ RSk(S) *and* ∀p ∈ P*, if* qp *is* sending*, then* ∀(qp, ℓ, q′p) ∈ δp : ∃φ ∈ A∗, s′ : (*q*; *w*) −φ→k s′ −ℓ→k ∧ p ∉ φ*.*

Definition 9 (k-Multiparty compatibility). S *is* k-multiparty compatible *(*k*-*mc*) if it is* k*-safe and* k*-exhaustive.*

Definition 9 is a natural extension of the definitions of *synchronous* multiparty compatibility given in [18, Definition 4.2] and [6, Definition 4]. The common key requirements are that *every send* action must be matched by a receive action (i.e., send actions are universally quantified), while *at least one receive* action must find a matching send action (i.e., receive actions are existentially quantified). Here, the universal check on send actions is done via the eventual reception property (er) and the k-exhaustivity condition; while the existential check on receive actions is dealt with by the progress property (pg).

Whenever systems are k-obi and ibi, k-exhaustivity implies that k-bounded executions are sufficient to make a sound decision wrt. safety. This is not necessarily the case for systems outside of this class, see Examples 3 and 5.

*Example 3.* The system (Mp, Mq, Mr) in Fig. 2 is k-obi for any k, but not ibi (it is 1-ibi but not k-ibi for any k ≥ 2). When executing with a bound strictly greater than 1, there is a configuration where Mq is in its initial state and *both* its *receive* transitions are enabled. The system is 1-safe and 1-exhaustive (hence 1-mc) but it is *not* 2-exhaustive nor 2-safe. By constraining the automata to execute with a channel bound of 1, the left branch of Mp is prevented from executing together with the right branch of Mq. Thus, the fact that the *y* messages are not received in this case remains invisible in 1-bounded executions. This example can easily be extended so that it is n-exhaustive (resp. n-safe) but not (n+1)-exhaustive (resp. (n+1)-safe) by sending/receiving n+1 *aᵢ* messages.

*Example 4.* The system in Fig. 1 is *directed* and 1-mc. The system (Mp, Mq) in Fig. 3 is safe but *not* k-mc for any finite k ∈ ℕ>0. Indeed, for any execution of this system, at least one of the queues grows arbitrarily large. The system (Mp, Nq) is 1-mc, while the system (Mp, N′q) is *not* 1-mc but it is 2-mc.

*(Diagram: the automata* Mp*,* Mq*, and* Mr *of Fig. 4; figure not recoverable from this extraction.)*

Fig. 4. Example of a system which is not 1-obi.

*Example 5.* The system in Fig. 4 (without the dotted transition) is 1-mc, but not 2-safe; it is not 1-obi but it is 2-obi. In 1-bounded executions, Mr can execute rs!*b* · rp!*z*, but it cannot fire rs!*b* · rs!*a* (queue rs is full), which violates the 1-obi property. The system with the dotted transition is not 1-obi, but it is 2-obi and k-mc for any k ≥ 1. Both systems are receive directed, hence ibi.

Theorem 1. *If* S *is* k*-*obi*,* ibi*, and* k*-*mc*, then it is safe.*

*Remark 4.* It is undecidable whether there exists a bound k for which an arbitrary system is k-mc. This is a consequence of the Turing completeness of communicating (session) automata [10,20,42].

Although the ibi property is generally undecidable, it is possible to identify sound approximations, as we show below. We adapt the dependency relation from [39] and say that action ℓ′ depends on ℓ from s = (*q*; *w*), written s ⊢ ℓ ≺ ℓ′, iff *subj*(ℓ) = *subj*(ℓ′) ∨ (*chan*(ℓ) = *chan*(ℓ′) ∧ w*chan*(ℓ) = ε). Action ℓ′ depends on ℓ in φ from s, written s ⊢ ℓ ≺φ ℓ′, if the following holds:

$$s \vdash \ell \prec\_{\phi} \ell' \iff \begin{cases} (s \vdash \ell \prec \ell'' \land s \vdash \ell'' \prec\_{\psi} \ell') \lor s \vdash \ell \prec\_{\psi} \ell' & \text{if } \phi = \ell'' \cdot \psi\\ s \vdash \ell \prec \ell' & \text{otherwise} \end{cases}$$
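The recursive relation ≺φ unfolds mechanically over an execution. Below is an illustrative encoding (the function names and the tuple encoding of actions as (sender, receiver, op, msg) are ours, not the paper's); a configuration is summarised by its queue contents, since the base relation only inspects subjects, channels, and queue emptiness.

```python
def depends(queues, l1, l2):
    """Base relation: l2 depends on l1 at a configuration whose queue
    contents are `queues` (dict: channel -> tuple of messages) iff they
    have the same subject, or they act on the same channel and that
    channel's queue is empty."""
    (s1, r1, op1, _), (s2, r2, op2, _) = l1, l2
    subj1 = s1 if op1 == "!" else r1   # subject: sender of a send, receiver of a receive
    subj2 = s2 if op2 == "!" else r2
    return subj1 == subj2 or ((s1, r1) == (s2, r2) and not queues.get((s1, r1), ()))

def depends_chain(queues, l1, phi, l2):
    """l2 depends on l1 in execution phi: either through the head of phi
    (chained via the tail), or already within the tail; for empty phi,
    fall back to the direct dependency."""
    if not phi:
        return depends(queues, l1, l2)
    head, rest = phi[0], phi[1:]
    return ((depends(queues, l1, head) and depends_chain(queues, head, rest, l2))
            or depends_chain(queues, l1, rest, l2))
```

For instance, a receive qp?*a* is chained to a send sp!*b* through an intermediate action sp?*d* (same subject p as the first, same channel sp as the second) only when channel sp is empty.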

Definition 10. S *is* k*-*chained input bound independent *(*k*-*cibi*) if* ∀s = (*q*; *w*) ∈ RSk(S) *and* ∀p ∈ P*, if* s −qp?*a*→k s′*, then* ∀(qp, sp?*b*, q′p) ∈ δp : s ≠ q ⟹ ¬(s −sp?*b*→k) ∧ (∀φ ∈ A∗ : s′ −φ→k −sp!*b*→k ⟹ s′ ⊢ qp?*a* ≺φ sp!*b*)*.*

Definition 11. S *is* k-strong input bound independent *(*k*-*sibi*) if* ∀s = (*q*; *w*) ∈ RSk(S) *and* ∀p ∈ P*, if* s −qp?*a*→k s′*, then* ∀(qp, sp?*b*, q′p) ∈ δp : s ≠ q ⟹ ¬(s −sp?*b*→k ∨ s′ −→k∗ −sp!*b*→k)*.*

Definition 10 requires that whenever p can fire a receive action, at most one of its receive actions is enabled at s, and no other receive transition from qp will be enabled until p has made a move. This is due to the existence of a dependency chain between the reception of a message (qp?*a*) and the send matching another possible reception (sp!*b*). Property k-sibi (Definition 11) is a stronger version of k-cibi, which can be checked more efficiently.

Lemma 1. *If* S *is* k*-*obi*,* k*-*cibi *(resp.* k*-*sibi*) and* k*-exhaustive, then it is* ibi*.*

The decidability of k-obi, k-ibi, k-sibi, k-cibi, and k-mc is straightforward since both RSk(S) (which has a number of states exponential in k) and −→k are finite, for any finite k. Theorem 2 states the space complexity of these procedures, except for k-cibi, for which a complexity class is yet to be determined. We show that the properties are in pspace by reducing them to an instance of the reachability problem over a transition system built following the construction of Bollig et al. [8, Theorem 6.3]. The rest of the proof follows from similar arguments in Genest et al. [22, Proposition 5.5] and Bouajjani et al. [9, Theorem 3].

Theorem 2. *The problems of checking the* k*-*obi*,* k*-*ibi*,* k*-*sibi*,* k*-safety, and* k*-exhaustivity properties are all decidable and* pspace*-complete (with* k ∈ ℕ>0 *given in unary). The problem of checking the* k*-*cibi *property is decidable.*

#### 3.2 Local Bound-Agnosticity

We introduce local bound-agnosticity and show that it fully characterises k-exhaustive systems. Local bound-agnosticity guarantees that each communicating automaton behaves in the same manner under any bound greater than or equal to some k. Therefore such systems may be executed transparently under a bounded semantics (a communication model available in Go and Rust).

Definition 12 (Transition system). *The* k*-bounded transition system of* S *is the labelled transition system (LTS)* TSk(S) = (N, s0, Δ) *such that* N = RSk(S)*,* s0 *is the initial configuration of* S*,* Δ ⊆ N × A × N *is the transition relation, and* (s, ℓ, s′) ∈ Δ *if and only if* s −ℓ→k s′*.*

Definition 13 (Projection). *Let* T *be an LTS over* A*. The* projection *of* T *onto* p*, written* πp(T)*, is obtained by replacing each label* ℓ *in* T *by* πp(ℓ)*.*

Recall that the projection of action ℓ, written πp(ℓ), is defined in Sect. 2. The automaton πp(TSk(S)) is essentially the *local* behaviour of participant p within the transition system TSk(S). When each automaton in a system S behaves equivalently for any bound greater than or equal to some k, we say that S is locally bound-agnostic. Formally, S is *locally bound-agnostic for* k when πp(TSk(S)) and πp(TSn(S)) are weakly bisimilar (≈) for each participant p and any n ≥ k. For k-obi and ibi systems, local bound-agnosticity is a *necessary and sufficient* condition for k-exhaustivity, as stated in Theorem 3 and Corollary 1.

Theorem 3. *Let* S *be a system.*

*(1) If* ∃k ∈ ℕ>0 : ∀p ∈ P : πp(TSk(S)) ≈ πp(TSk+1(S))*, then* S *is* k*-exhaustive. (2) If* S *is* k*-*obi*,* ibi*, and* k*-exhaustive, then* ∀p ∈ P : πp(TSk(S)) ≈ πp(TSk+1(S))*.*

Corollary 1. *Let* S *be* k*-*obi *and* ibi *s.t.* ∀p ∈ P : πp(TSk(S)) ≈ πp(TSk+1(S))*; then* S *is locally bound-agnostic for* k*.*

Theorem 3 (1) is reminiscent of the (pspace-complete) checking procedure for existentially bounded systems with the stable property [22] (an *undecidable* property). Recall that k-exhaustivity is not sufficient to guarantee safety, see Examples 3 and 5. We give an effective procedure (based on partial order reduction) to check k-exhaustivity and related properties in [43].

Fig. 5. Relations between k-exhaustivity, existential k-boundedness, and k-synchronisability in k-obi and ibi csa (the circled numbers refer to Table 1).

#### 4 Existentially Bounded and Synchronisable Automata

#### 4.1 Kuske and Muscholl's Existential Boundedness

Existentially bounded communicating automata [21,22,35] are a class of communicating automata whose executions can always be scheduled in such a way that the number of pending messages is bounded by a given value. Traditionally, existentially bounded communicating automata are defined over automata that feature (local) accepting states, in terms of *accepting runs*. An accepting run is an execution (starting from s0) which terminates in a configuration (*q*; *w*) where each qp is a local accepting state. In our setting, we simply consider every local state qp to be accepting, hence any execution φ starting from s0 is an accepting run. We first study existential boundedness as defined in [35], as it more closely matches k-exhaustivity; we turn to the "classical" definition of existential boundedness [22] in Sect. 4.2.

Following [35], we say that an execution φ ∈ A∗ is *valid* if, for any prefix ψ of φ and any channel pq ∈ C, we have that π?pq(ψ) is a prefix of π!pq(ψ), i.e., an execution is valid if it models the fifo semantics of communicating automata.
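Validity can be checked in a single pass over an execution by tracking, per channel, the messages sent so far and how many have been consumed. A small sketch under our assumed tuple encoding of actions as (sender, receiver, op, msg):

```python
def is_valid(phi):
    """Valid execution: on every channel, the sequence of received messages
    is always a prefix of the sequence of sent messages (FIFO: a message is
    consumed only after being sent, and in sending order)."""
    sent, rcvd = {}, {}
    for (s, r, op, msg) in phi:
        ch = (s, r)
        if op == "!":
            sent.setdefault(ch, []).append(msg)
        else:
            n = rcvd.get(ch, 0)
            # The n-th receive must consume the n-th message sent on ch.
            if n >= len(sent.get(ch, [])) or sent[ch][n] != msg:
                return False
            rcvd[ch] = n + 1
    return True
```

Receiving a message before it is sent, or out of sending order, both falsify the prefix condition.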

Definition 14 (Causal equivalence [35]). *Given* φ, ψ ∈ A∗*, we define:* φ ∼ ψ *iff* φ *and* ψ *are* valid *executions and* ∀p ∈ P : πp(φ) = πp(ψ)*. We write* [φ]∼ *for the equivalence class of* φ *wrt.* ∼*.*

Definition 15 (Existential boundedness [35]). *We say that a valid execution* φ *is* k-match-bounded *if, for every prefix* ψ *of* φ*, the difference between the number of* matched *events of type* pq! *and those of type* pq? *is bounded by* k*, i.e.,* min{|π!pq(ψ)|, |π?pq(φ)|} − |π?pq(ψ)| ≤ k*.*

*Write* A∗|k *for the set of* k*-match-bounded words. An execution* φ *is* existentially k-bounded *if* [φ]∼ ∩ A∗|k ≠ ∅*. A system* S *is existentially* k*-bounded, written* ∃k*-bounded, if each execution in* {φ | ∃s : s0 −φ→ s} *is existentially* k*-bounded.*
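k-match-boundedness of a concrete execution is a direct transcription of the inequality in Definition 15; by the fifo discipline, the matched sends on a channel are exactly the first |π?pq(φ)| ones. The sketch below (naive quadratic scan, our assumed tuple encoding of actions) is illustrative only:

```python
def is_k_match_bounded(phi, k):
    """Check min{|pi!(psi)|, |pi?(phi)|} - |pi?(psi)| <= k for every
    prefix psi of phi and every channel, where phi is a list of actions
    (sender, receiver, op, msg) with op in {"!", "?"}."""
    channels = {(s, r) for (s, r, _, _) in phi}
    for (s, r) in channels:
        sends = [i for i, (a, b, op, _) in enumerate(phi) if (a, b, op) == (s, r, "!")]
        recvs = [i for i, (a, b, op, _) in enumerate(phi) if (a, b, op) == (s, r, "?")]
        matched = len(recvs)  # total receives on (s, r) = number of matched sends
        for n in range(1, len(phi) + 1):
            sent = sum(1 for i in sends if i < n)   # |pi!(psi)| for the prefix of length n
            rcvd = sum(1 for i in recvs if i < n)   # |pi?(psi)|
            if min(sent, matched) - rcvd > k:
                return False
    return True
```

For instance, the execution pq!*a* · pq!*a* · pq?*a* · pq?*a* is 2-match-bounded but not 1-match-bounded, while its ∼-equivalent reordering pq!*a* · pq?*a* · pq!*a* · pq?*a* is 1-match-bounded; unmatched sends never count against the bound.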

*Example 6.* Consider Fig. 3. (Mp, Mq) is *not* existentially k-bounded, for any k: at least one of the queues must grow infinitely for the system to progress. Systems (Mp, Nq) and (Mp, N′q) are existentially bounded since any of their executions can be scheduled to a ∼-equivalent execution which is 2-match-bounded.

The relationship between k-exhaustivity and existential boundedness is stated in Theorem 4 and illustrated in Fig. 5 for k-obi and ibi csa, where smc refers to synchronous multiparty compatibility [18, Definition 4.2]. The circled numbers in the figure refer to key examples summarised in Table 1. The strict inclusion of k-exhaustivity in existential k-boundedness is due to systems that do not have the eventual reception property, see Example 7.

*Example 7.* The system below is ∃-1-bounded but is *not* k-exhaustive for any k.

*(Diagram: the three automata of the system in Example 7; figure not recoverable from this extraction.)*

For any k, the channel sp eventually gets full and the send action sp!*b* can no longer be fired; hence the system does *not* satisfy k-exhaustivity. Note that each execution can be reordered into a 1-match-bounded execution (the *b*'s are never matched).

Theorem 4. *(1) If* S *is* k*-*obi*,* ibi*, and* k*-exhaustive, then it is* ∃*-*k*-bounded. (2) If* S *is* ∃*-*k*-bounded and satisfies eventual reception, then it is* k*-exhaustive.*

#### 4.2 Existentially Stable Bounded Communicating Automata

The "classical" definition of existentially bounded communicating automata as found in [22] differs slightly from Definition 15, as it relies on a different notion of accepting runs, see [22, page 4]. Assuming that all local states are accepting, we adapt their definition as follows: a *stable accepting run* is an execution φ starting from s<sup>0</sup> which terminates in a *stable* configuration.

Definition 16 (Existential stable boundedness [22]). *A system* S *is* existentially stable k-bounded*, written* ∃S*-*k*-bounded, if for each execution* φ *in* {φ | ∃(*q*; *ε*) ∈ RS(S) : s0 −φ→ (*q*; *ε*)} *there is* ψ *such that* s0 −ψ→k *with* φ ∼ ψ*.*

A system is existentially stable k-bounded if each of its executions leading to a *stable* configuration can be re-ordered into a k-bounded execution (from s0).

Theorem 5. *(1) If S is existentially k-bounded, then it is existentially stable k-bounded. (2) If S is existentially stable k-bounded and has the stable property, then it is existentially k-bounded.*

We illustrate the relationship between existentially stable bounded communicating automata and the other classes in Fig. 5. The example below further illustrates the strictness of the inclusions; see Table 1 for a summary.

*Example 8.* Consider the systems in Fig. 3. (Mp, Mq) and (Mp, N′q) are (trivially) existentially stable 1-bounded since none of their (non-empty) executions terminate in a stable configuration. The system (Mp, Nq) is existentially stable 2-bounded since each of its executions can be re-ordered into a 2-bounded one. The system in Example 7 is (trivially) ∃S-1-bounded: none of its (non-empty) executions terminate in a stable configuration (the *b*'s are never received).

Theorem 6. *Let S be an ∃(S)-k-bounded system with the stable property; then it is k-exhaustive.*

Table 1. Properties for key examples, where direct. stands for directed, obi for k-obi, sibi for k-sibi, er for the eventual reception property, sp for the stable property, exh. for k-exhaustive, ∃(S)-b for ∃(S)-bounded, and syn. for n-synchronisable (for some n ∈ N>0).


#### 4.3 Synchronisable Communicating Session Automata

In this section, we study the relationship between synchronisability [9] and k-exhaustivity via existential boundedness. Informally, communicating automata are synchronisable if each of their executions can be scheduled in such a way that it consists of sequences of "exchange phases", where each phase consists of a bounded number of send actions, followed by a sequence of receive actions. The original definition of k-synchronisable systems [9, Definition 1] is based on communicating automata with *mailbox* semantics, i.e., each automaton has one input queue. Here, we adapt the definition so that it matches our point-to-point semantics. We write A! for A ∩ (C × {!} × Σ), and A? for A ∩ (C × {?} × Σ).

Definition 17 (Synchronisability). *A valid execution* φ = φ1 ··· φn *is a* k-exchange *if and only if: (1)* ∀1 ≤ i ≤ n : φi ∈ A!∗ · A?∗ ∧ |φi| ≤ 2k; *and (2)* ∀pq ∈ C : ∀1 ≤ i ≤ n : π!pq(φi) ≠ π?pq(φi) ⟹ ∀i < j ≤ n : π?pq(φj) = ε.

*We write* A∗k *for the set of executions that are k-exchanges and say that an execution φ is* k-synchronisable *if* [φ]∼ ∩ A∗k ≠ ∅. *A system S is* k-synchronisable *if each execution in* {φ | ∃s : s0 −φ→ s} *is k-synchronisable.*

Table 2. Experimental evaluation. *|P|* is the number of participants, k is the bound, *|RTS|* is the number of transitions in the reduced transition system *RTS*k(S) (see [43]), direct. stands for directed, Time is the time taken to check all the properties shown in this table, and gmc is yes if the system is generalised multiparty compatible [39].


Condition (1) says that execution φ should be a sequence of an arbitrary number of send-receive phases, where each phase consists of at most 2k actions. Condition (2) says that if a message is not received in the phase in which it is sent, then it cannot be received in φ. Observe that the bound k is on the number of actions (over possibly different channels) in a phase rather than the number of pending messages in a given channel.
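The two conditions of Definition 17 can be checked mechanically on an execution that has already been split into phases. The sketch below is illustrative only: the representation of actions as `(channel, polarity, message)` triples, and the function name, are our assumptions, not the paper's tool.

```python
# Hedged sketch: check whether an execution, given as a list of phases,
# is a k-exchange in the sense of Definition 17.
# An action is a triple (channel, '!' or '?', message).

def is_k_exchange(phases, k):
    # Condition (1): each phase is sends followed by receives,
    # with at most 2k actions in total.
    for phase in phases:
        if len(phase) > 2 * k:
            return False
        polarities = [pol for (_, pol, _) in phase]
        # '!' < '?' in ASCII, so a sorted polarity list means "sends first".
        if polarities != sorted(polarities):
            return False
    # Condition (2): if some message on channel pq is left unmatched in
    # phase i (send and receive projections differ), then no later phase
    # may receive on pq.
    channels = {ch for phase in phases for (ch, _, _) in phase}
    for ch in channels:
        for i, phase in enumerate(phases):
            sends = [m for (c, pol, m) in phase if c == ch and pol == '!']
            recvs = [m for (c, pol, m) in phase if c == ch and pol == '?']
            if sends != recvs:  # unmatched message in this phase
                for later in phases[i + 1:]:
                    if any(c == ch and pol == '?' for (c, pol, _) in later):
                        return False
    return True
```

For instance, a single phase `[('pq','!','a'), ('pq','?','a')]` is a 1-exchange, while sending *b* in one phase and receiving it in a later one violates condition (2).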

*Example 9.* The system below (left) is 1-mc and ∃(S)-1-bounded, but it is *not* k-synchronisable for any k. The subsequences of send-receive actions in the ∼-equivalent executions below are highlighted (right).

*(Machines Mp and Mq, with the two ∼-equivalent executions φ1 and φ2 and their highlighted send-receive phases: diagram not recoverable from the source.)*

Execution φ1 is 1-bounded for s0, but it is not a k-exchange since, e.g., *a* is received outside of the phase where it is sent. In φ2, message *d* is received outside of its sending phase. In the terminology of [9], this system is not k-synchronisable because there is a "*receive-send dependency*" between the exchanges of messages *c* and *b*, i.e., p must receive *c* before it sends *b*. Hence, there is no k-exchange that is ∼-equivalent to φ1 or φ2.

Theorem 7. *(1) If S is k-synchronisable, then it is ∃-k-bounded. (2) If S is k-synchronisable and has the eventual reception property, then it is k-exhaustive.*

Figure 5 and Table 1 summarise the results of Sect. 4 wrt. k-obi and ibi csa. We note that any finite-state system is k-exhaustive (and ∃(S)-k-bounded) for sufficiently large k, while this does not hold for synchronisability; see Example 9.

#### 5 Experimental Evaluation

We have implemented our theory in a tool [33] which takes two inputs: (i) a system of communicating automata and (ii) a bound max. The tool iteratively checks whether the system validates the premises of Theorem 1, until it succeeds or reaches k = max. We note that the k-obi and ibi conditions are required for our soundness result (Theorem 1), but are orthogonal to checking k-mc. Each condition is checked on a *reduced bounded transition system*, called *RTS*k(S). Each verification procedure for these conditions is implemented in Haskell using a simple (depth-first-search based) reachability check on the paths of *RTS*k(S). We give an (optimal) partial order reduction algorithm to construct *RTS*k(S) in [43] and show that it preserves our properties.
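The tool itself is written in Haskell and applies partial order reduction. As a language-agnostic illustration of the underlying idea only (no reduction, and with an encoding of automata that is our assumption), the following Python sketch enumerates the configurations of the k-bounded transition system of a system of communicating automata by depth-first search.

```python
# Hedged sketch: depth-first exploration of the k-bounded transition system.
# Each automaton is a pair (initial_state, transitions), where transitions
# maps a state to a list of ((kind, channel, message), next_state) with
# kind in {'send', 'recv'}. Channels are point-to-point FIFO queues whose
# length is bounded by k.

def bounded_reachable(automata, k):
    init = (tuple(a0 for (a0, _) in automata), ())
    seen, stack = {init}, [init]
    while stack:
        states, queues = stack.pop()
        q = dict(queues)
        for i, (_, trans) in enumerate(automata):
            for (kind, ch, msg), nxt in trans.get(states[i], []):
                buf = q.get(ch, ())
                if kind == 'send' and len(buf) < k:        # k-bounded send
                    q2 = dict(q); q2[ch] = buf + (msg,)
                elif kind == 'recv' and buf and buf[0] == msg:  # FIFO receive
                    q2 = dict(q); q2[ch] = buf[1:]
                    if not q2[ch]:
                        del q2[ch]
                else:
                    continue
                cfg = (states[:i] + (nxt,) + states[i + 1:],
                       tuple(sorted(q2.items())))
                if cfg not in seen:
                    seen.add(cfg)
                    stack.append(cfg)
    return seen
```

On a two-machine system where p sends *a* to q and q receives it, this explores exactly the initial configuration, the one with *a* in transit, and the final stable configuration.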

We have tested our tool on 20 examples taken from the literature, which are reported in Table 2. The table shows that the tool terminates virtually instantaneously on all examples. The table suggests that many systems are indeed <sup>k</sup>-mc and most can be easily adapted to validate bound independence. The last column refers to the gmc condition, a form of *synchronous* multiparty compatibility (smc) introduced in [39]. The examples marked with † have been slightly modified to make them csa that validate <sup>k</sup>-obi and ibi. For instance, we take only one of the possible interleavings between mixed actions to remove mixed states (taking send action before receive action to preserve safety), see [43].

We have assessed the scalability of our approach with automatically generated examples, which we report in Fig. 6. Each system considered in these benchmarks consists of 2m (directed) csa for some m ≥ 1 such that S = (Mpi)1≤i≤2m, and each automaton Mpi is of the form (when i is *odd*):

Mpi : k successive send steps, each sending one of pipi+1!*a1*, ..., pipi+1!*an*, followed by k successive receive steps, each receiving one of pi+1pi?*a1*, ..., pi+1pi?*an* (diagram omitted).

Each Mpi sends k messages to participant pi+1, then receives k messages from pi+1. Each message is taken from an alphabet {*a1*, ..., *an*} (n ≥ 1). Mpi has the same structure when i is *even*, but interacts with pi−1 instead. Observe that any system constructed in this way is k-mc for any k ≥ 1, n ≥ 1, and m ≥ 1. The shape of these systems allows us to assess how our approach fares in the worst case, i.e., a large number of paths in *RTS*k(S). Figure 6 gives the time taken for our tool to terminate (y axis) wrt. the number of transitions in *RTS*k(S), where k is the least natural number for which the system is k-mc. The plot on the left in Fig. 6 gives the timings when k is increasing (every increment from k = 2 to k = 100) with the other parameters fixed (n = 1 and m = 5). The middle plot gives the timings when m is increasing (every increment from m = 1 to m = 26) with k = 10 and n = 1. The right-hand side plot gives the timings when n is increasing (every increment from n = 1 to n = 10) with k = 2 and m = 1. The largest *RTS*k(S) on which we have tested our tool has 12222 states and 22220 transitions, and the verification took under 17 min.<sup>1</sup> Observe that partial order reduction mitigates the increasing size of the transition system on which k-mc is checked, e.g., these experiments show that parameters k and m have only a linear effect on the number of transitions (see horizontal distances between data points). However, the number of transitions increases exponentially with n (since the number of paths in each automaton increases exponentially with n).
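The benchmark family above can be generated mechanically. The sketch below is a hedged reconstruction of that generator (names and encoding are our assumptions, and both members of a pair send their k messages before receiving, matching the "same structure" description); each step offers a choice among the n messages of the alphabet.

```python
# Illustrative generator (not the paper's tool) for the benchmark family:
# 2m automata grouped in pairs (p1, p2), (p3, p4), ...; each machine sends
# k messages to its peer (each drawn from an n-letter alphabet), then
# receives k messages from the peer. States are 0 .. 2k.

def make_benchmark(m, k, n):
    msgs = [f'a{j}' for j in range(1, n + 1)]
    automata = {}
    for i in range(1, 2 * m + 1):
        peer = i + 1 if i % 2 == 1 else i - 1   # odd machines pair forward
        out_ch, in_ch = f'p{i}p{peer}', f'p{peer}p{i}'
        trans = {}
        for step in range(2 * k):
            kind = 'send' if step < k else 'recv'
            ch = out_ch if kind == 'send' else in_ch
            # at each step, a nondeterministic choice among the n messages
            trans[step] = [((kind, ch, a), step + 1) for a in msgs]
        automata[f'p{i}'] = (0, trans)          # initial state 0
    return automata
```

For m = 1, k = 2, n = 3 this yields the two machines p1 and p2, each with two send steps (three message choices each) followed by two receive steps.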

#### 6 Related Work

*Theory of Communicating Automata.* Communicating automata were introduced, and shown to be Turing powerful, in the 1980s [10], and have since been studied extensively, notably through their connection with message sequence charts (MSC) [46]. Several works achieved decidability results by using bag or lossy channels [1,2,13,14] or by restricting the topology of the network [36,57].

Existentially bounded communicating automata stand out because they preserve the fifo semantics of communicating automata, do not restrict the topology of the network, and include infinite state systems. Given a bound k and

<sup>1</sup> All the benchmarks in this paper were run on an 8-core Intel i7-7700 machine with 16 GB RAM running a 64-bit Linux.

Fig. 6. Benchmarks: increasing k (left), increasing m (middle), and increasing n (right).

an arbitrary system of (deterministic) communicating automata S, it is generally *undecidable* whether S is existentially k-bounded. However, the question becomes decidable (pspace-complete) when <sup>S</sup> has the stable property. The stable property is itself generally *undecidable* (it is called deadlock-freedom in [22,35]). Hence this class is *not* directly applicable to the verification of message passing programs since its membership is overall undecidable. We have shown that <sup>k</sup>-obi, ibi, and <sup>k</sup>-exhaustive csa systems are (strictly) included in the class of existentially bounded systems. Hence, our work gives a sound *practical* procedure to check whether csa are existentially <sup>k</sup>-bounded. To the best of our knowledge, the only tools dedicated to the verification of (unbounded) communicating automata are McScM [26] and Chorgram [40]. Bouajjani et al. [9] study a variation of communicating automata with *mailboxes* (one input queue per automaton). They introduce the class of synchronisable systems and a procedure to check whether a system is k-synchronisable; it relies on executions consisting of k-bounded exchange phases. Given a system and a bound k, it is decidable (pspace-complete) whether its executions are equivalent to <sup>k</sup>-synchronous executions. Section 4.3 states that any k-synchronisable system which satisfies eventual reception is also k-exhaustive, see Theorem 7. In contrast to existential boundedness, synchronisability does not include all finite-state systems. Our characterisation result, based on local bound-agnosticity (Theorem 3), is *unique* to k-exhaustivity. It does not apply to existential boundedness nor synchronisability, see, e.g., Example 7. The term "synchronizability" is used by Basu et al. [3,4] to refer to another verification procedure for communicating automata with mailboxes. Finkel and Lozes [19] have shown that this notion of synchronizability is undecidable. 
We note that a system that is safe under a point-to-point semantics may not be safe under a mailbox semantics (due to independent send actions), and vice versa. For instance, the system in Fig. 2 is safe when executed with mailbox semantics.

*Multiparty Compatibility and Programming Languages.* The first definition of multiparty compatibility appeared in [18, Definition 4.2], inspired by the work in [23], to characterise the relationship between global types and communicating automata. This definition was later adapted to the setting of communicating timed automata in [6]. Lange et al. [39] introduced a generalised version of multiparty compatibility (gmc) to support communicating automata that feature mixed or non-directed states. Because our results apply to automata without mixed states, <sup>k</sup>-mc is not a strict extension of gmc, and gmc is not a strict extension of <sup>k</sup>-mc either, as it requires the existence of *synchronous* executions. In future work, we plan to develop an algorithm to synthesise representative choreographies from <sup>k</sup>-mc systems, using the algorithm in [39].

The notion of multiparty compatibility is at the core of recent works that apply session types techniques to programming languages. Multiparty compatibility is used in [51] to detect deadlocks in Go programs, and in [30] to study the well-formedness of Scribble protocols [64] through the compatibility of their projections. These protocols are used to generate various endpoint APIs that implement a Scribble specification [30,31,48], and to produce runtime monitoring tools [47,49,50]. Taylor et al. [67] use multiparty compatibility and choreography synthesis [39] to automate the analysis of the gen\_server library of Erlang/OTP. We can transparently widen the set of safe programs captured by these tools by using <sup>k</sup>-mc instead of synchronous multiparty compatibility (smc). The <sup>k</sup>-mc condition corresponds to a much wider instance of the *abstract* safety invariant <sup>ϕ</sup> for session types defined in [63]. Indeed <sup>k</sup>-mc includes smc (see [43]) and all finite-state systems (for k sufficiently large).

#### 7 Conclusions

We have studied csa via a new condition called k-exhaustivity. The k-exhaustivity condition is (i) the basis for a wider notion of multiparty compatibility, k-mc, which captures asynchronous interactions, and (ii) the first practical, empirically validated, sufficient condition for existential k-boundedness. We have shown that k-exhaustive systems are fully characterised by local bound-agnosticity (each automaton behaves equivalently for any bound greater than or equal to k). This is a key requirement for asynchronous message passing programming languages where the possibility of having infinitely many orphan messages is undesirable, in particular for Go and Rust, which provide *bounded* communication channels.

For future work, we plan to extend our theory beyond csa. We believe that it is possible to support mixed states and states which do not satisfy ibi, as long as their outgoing transitions are independent (i.e., if they commute). Additionally, to make <sup>k</sup>-mc checking more efficient, we will elaborate heuristics to find optimal bounds and off-load the verification of <sup>k</sup>-mc to an off-the-shelf model checker.

Acknowledgements. We thank Laura Bocchi and Alceste Scalas for their comments, and David Castro and Nicolas Dilley for testing the artifact. This work is partially supported by EPSRC EP/K034413/1, EP/K011715/1, EP/L00058X/1, EP/N027833/1, and EP/N028201/1.

#### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### Security and Hyperproperties

### **Verifying Hyperliveness**

Norine Coenen<sup>1(B)</sup>, Bernd Finkbeiner<sup>1</sup>, César Sánchez<sup>2</sup>, and Leander Tentrup<sup>1</sup>

<sup>1</sup> Reactive Systems Group, Saarland University, Saarbrücken, Germany coenen@react.uni-saarland.de <sup>2</sup> IMDEA Software Institute, Madrid, Spain

**Abstract.** HyperLTL is an extension of linear-time temporal logic for the specification of hyperproperties, i.e., temporal properties that relate multiple computation traces. HyperLTL can express information flow policies as well as properties like symmetry in mutual exclusion algorithms or Hamming distances in error-resistant transmission protocols. Previous work on HyperLTL model checking has focussed on the alternation-free fragment of HyperLTL, where verification reduces to checking a standard trace property over an appropriate self-composition of the system. The alternation-free fragment does, however, not cover general hyperliveness properties. Universal formulas, for example, cannot express the secrecy requirement that for every possible value of a secret variable there exists a computation where the value is different while the observations made by the external observer are the same. In this paper, we study the more difficult case of hyperliveness properties expressed as HyperLTL formulas with quantifier alternation. We reduce existential quantification to strategic choice and show that synthesis algorithms can be used to eliminate the existential quantifiers automatically. We furthermore show that this approach can be extended to reactive system synthesis, i.e., to automatically construct a reactive system that is guaranteed to satisfy a given HyperLTL formula.

#### **1 Introduction**

HyperLTL [6] is a temporal logic for *hyperproperties* [7], i.e., for properties that relate multiple computation traces. Hyperproperties cannot be expressed in standard linear-time temporal logic (LTL), because LTL can only express *trace properties*, i.e., properties that characterize the correctness of individual computations. Even branching-time temporal logics like CTL and CTL∗, which quantify

This work was partially supported by the German Research Foundation (DFG) as part of the Collaborative Research Center "Foundations of Perspicuous Software Systems" (TRR 248, 389792660), by the European Research Council (ERC) Grant OSARES (No. 683300), by Madrid Reg. Government project "S2018/TCS-4339 (BLOQUES-CM)", by EU H2020 project 731535 "Elastest", and by Spanish National Project "BOSCO (PGC2018-102210-B-100)".

over computation paths, cannot express hyperproperties, because quantifying over a second path automatically means that the subformula can no longer refer to the previously quantified path. HyperLTL addresses this limitation with quantifiers over trace variables, which allow the subformula to refer to all previously chosen traces. For example, *noninterference* [21] between a secret input h and a public output o can be specified in HyperLTL by requiring that all pairs of traces π and π′ that always have the same inputs except for h (i.e., all inputs in I \ {h} are equal on π and π′) also have the same output o at all times:

$$\forall \pi. \forall \pi'. \Box \left(\bigwedge\_{i \in I \backslash \{h\}} i\_{\pi} = i\_{\pi'}\right) \Rightarrow \Box (o\_{\pi} = o\_{\pi'})$$

This formula states that a change in the secret input h alone cannot cause any difference in the output o.

For certain properties of interest, the additional expressiveness of HyperLTL comes at no extra cost when considering the model checking problem. To check a property like noninterference, which only has universal trace quantifiers, one simply builds the self-composition of the system, which provides a separate copy of the state variables for each trace. Instead of quantifying over all pairs of traces, it then suffices to quantify over individual traces of the self-composed system, which can be done with standard LTL. Model checking universal formulas is NLOGSPACE-complete in the size of the system and PSPACE-complete in the size of the formula, which is precisely the same complexity as for LTL.
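As a concrete illustration of the self-composition idea (all names and the encoding are our assumptions, not MCHyper's API), the following sketch pairs two copies of a deterministic finite-state transition system, restricts attention to input pairs that agree on everything except the secret h, and checks that the output labels agree in every reachable pair of states. For such systems this decides the box-shaped noninterference formula above: a divergence in any reachable pair witnesses a violating pair of traces, and conversely.

```python
# Hedged sketch of self-composition for a universal HyperLTL property.
# tau: (state, input_valuation) -> state; label: state -> output;
# an input valuation is a frozenset of the propositions set to true.
from itertools import chain, combinations

def all_valuations(props):
    """All subsets of props, each as a frozenset."""
    return [frozenset(c) for c in
            chain.from_iterable(combinations(props, r)
                                for r in range(len(props) + 1))]

def check_noninterference(s0, tau, label, inputs, h):
    """Explore pairs of states reachable via input pairs that agree on
    every input except the secret h; noninterference holds iff the
    output label agrees in every reachable pair."""
    vals = all_valuations(inputs)
    seen, stack = {(s0, s0)}, [(s0, s0)]
    while stack:
        s, t = stack.pop()
        if label(s) != label(t):       # low-equal inputs, different outputs
            return False
        for u in vals:
            for v in vals:
                if u - {h} != v - {h}:  # copies must agree outside {h}
                    continue
                nxt = (tau(s, u), tau(t, v))
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return True
```

A system that copies h to its output is rejected, while a system with constant output is accepted.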

Universal HyperLTL formulas suffice to express hypersafety properties like noninterference, but not hyperliveness properties that require, in general, quantifier alternation. A prominent example is *generalized noninterference* (GNI) [27], which can be expressed as the following HyperLTL formula:

$$\forall \pi. \forall \pi'. \exists \pi''. \ \Box (h\_{\pi} = h\_{\pi''}) \ \land \ \Box (o\_{\pi'} = o\_{\pi''})$$

This formula requires that for every pair of traces π and π′, there is a third trace π′′ in the system that agrees with π on h and with π′ on o. The existence of an appropriate trace π′′ ensures that in π and π′, the value of o is not determined by the value of h. Generalized noninterference stipulates that low-security outputs may not be altered by the injection of high-security inputs, while permitting nondeterminism in the low-observable behavior. The existential quantifier is needed to allow this nondeterminism. GNI is a hyperliveness property [7] even though the underlying LTL formula is a safety property. The reason for that is that we can extend any set of traces that violates GNI into a set of traces that satisfies GNI, by adding, for each offending pair of traces π, π′, an appropriate trace π′′.
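On a finite set of finite traces the quantifier structure ∀π.∀π′.∃π′′ can be checked by brute force. The following toy sketch (the encoding of a trace as a sequence of (h, o) pairs is our assumption) is only meant to make that structure concrete, not to be a model checking algorithm.

```python
# Toy illustration of GNI's quantifier structure on finite traces.
# A trace is a tuple of (h, o) pairs; GNI holds iff for every pair of
# traces (p1, p2) there is some witness w agreeing with p1 on h and
# with p2 on o at every step.

def satisfies_gni(traces):
    return all(
        any(all(w[t][0] == p1[t][0] and w[t][1] == p2[t][1]
                for t in range(len(w)))
            for w in traces)
        for p1 in traces for p2 in traces)
```

A trace set closed under recombining the h-column of one trace with the o-column of another satisfies the condition; a set where h fully determines o does not.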

Hyperliveness properties also play an important role in applications beyond security. For example, *robust cleanness* [9] specifies that significant differences in the output behavior are only permitted after significant differences in the input:

$$\forall \pi. \forall \pi'. \exists \pi''. \ \Box \left(i\_{\pi'} = i\_{\pi''}\right) \land \left(\hat{d}(o\_{\pi}, o\_{\pi''}) \leq \kappa\_o \; \mathcal{W} \; \hat{d}(i\_{\pi}, i\_{\pi''}) > \kappa\_i\right)$$

The differences are measured by a distance function d̂ and compared to constant thresholds κi for the input and κo for the output. The formula specifies the existence of a trace π′′ that globally agrees with π′ on the input and where the difference in the output o between π and π′′ is bounded by κo, unless the difference in the input i between π and π′′ was greater than κi. Robust cleanness, thus, forbids unexpected jumps in the system behavior that are, for example, due to software doping, while allowing for behavioral differences due to nondeterminism.

With quantifier alternation, the model checking problem becomes much more difficult. Model checking HyperLTL formulas of the form <sup>∀</sup>∗∃∗ϕ, where <sup>ϕ</sup> is a quantifier-free formula, is PSPACE-complete in the size of the system and EXPSPACE-complete in the formula. The only known model checking algorithm replaces the existential quantifier with the negation of a universal quantifier over the negated subformula; but this requires a complementation of the system behavior, which is completely impractical for realistic systems.

In this paper, we present an alternative approach to the verification of hyperliveness properties. We view the model checking problem of a formula of the form ∀π.∃π′. ϕ as a game between the ∀-player and the ∃-player. While the ∀-player moves through the state space of the system building trace π, the ∃-player must match each move in a separate traversal of the state space, resulting in a trace π′ such that the pair π, π′ satisfies ϕ. Clearly, the existence of a winning strategy for the ∃-player implies that ∀π.∃π′. ϕ is satisfied. The converse is not necessarily true: even if there always is a trace π′ that matches the universally chosen trace π, the ∃-player may not be able to construct this trace, because she only knows about the choices made by the ∀-player in the finite prefix of π that has occurred so far, and not the choices that will be made by the ∀-player in the infinite future. We address this problem by introducing *prophecy variables* into the system. Without changing the behavior of the system, the prophecy variables give the ∃-player the information about the future that is needed to make the right choice after seeing only the finite prefix. Such prophecy variables can be provided manually by the user of the model checker to provide a lookahead on future moves of the ∀-player.

This game-theoretic approach provides an opportunity for the user to reduce the complexity of the model checking problem: If the user provides a strategy for the ∃-player, then the problem reduces to the cheaper model checking problem for universal properties. We show that such strategies can also be constructed automatically using synthesis. Beyond model checking, the game-theoretic approach also provides a method for the synthesis of systems that satisfy a conjunction of hypersafety and hyperliveness properties. Here, we do not only synthesize the strategy, but also construct the system itself, i.e., the game graph on which the model checking game is played. While the synthesis from ∀<sup>∗</sup>∃<sup>∗</sup> hyperproperties is known to be undecidable in general, we show that the game-theoretic approach can naturally be integrated into bounded synthesis, which checks for the existence of a correct system up to a bound on the number of states.

**Related Work.** While the verification of general HyperLTL formulas has been studied before [6,17,18], there has been, so far, no practical model checking algorithm for HyperLTL formulas with quantifier alternation. The existing algorithm involves a complementation of the system automaton, which results in an exponential blow-up of the state space [18]. The only existing model checker for HyperLTL, MCHyper [18], was therefore, so far, limited to the alternation-free fragment. Although some hyperliveness properties lie in this fragment, quantifier alternation is needed to express general hyperliveness properties like GNI. In this paper, we present a technique to model check these hyperliveness properties and extend MCHyper to formulas with quantifier alternation.

The situation is similar in the area of reactive synthesis. There is a synthesis algorithm that automatically constructs implementations from HyperLTL specifications [13] using the bounded synthesis approach [20]. This algorithm is, however, also only applicable to the alternation-free fragment of HyperLTL. In this paper, we extend the bounded synthesis approach to HyperLTL formulas with quantifier alternation. Beyond the model checking and synthesis problems, the satisfiability [11,12,14] and monitoring [15,16,22] problems of HyperLTL have also been studied in the past.

For certain information-flow security policies, there are verification techniques that use methods related to our model checking and synthesis algorithms. Specifically, the self-composition technique [2,3], a construction based on the product of copies of a system, has been tailored for various trace-based security definitions [10,23,28]. Unlike our algorithms, these techniques focus on specific information-flow policies, not on a general logic like HyperLTL.

The use of prophecy variables [1] to make information about the future accessible is a known technique in the verification of trace properties. It is, for example, used to establish simulation relations between automata [26] or in the verification of CTL<sup>∗</sup> properties [8].

In our game-theoretic view on the model checking problem for ∀<sup>∗</sup>∃<sup>∗</sup> hyperproperties the ∃-player has an infinite lookahead. There is some work on *finite* lookahead on trace languages [24]. We use the idea of finite lookahead as an approximation to construct existential strategies and give a novel synthesis construction for strategies with delay based on bounded synthesis [20].

#### **2 Preliminaries**

For tuples *<sup>x</sup>* <sup>∈</sup> <sup>X</sup><sup>n</sup> and *<sup>y</sup>* <sup>∈</sup> <sup>X</sup><sup>m</sup> over set <sup>X</sup>, we use *<sup>x</sup>* · *<sup>y</sup>* <sup>∈</sup> <sup>X</sup>n+<sup>m</sup> to denote the concatenation of *<sup>x</sup>* and *<sup>y</sup>*. Given a function <sup>f</sup> : <sup>X</sup> <sup>→</sup> <sup>Y</sup> and a tuple *<sup>x</sup>* <sup>∈</sup> <sup>X</sup><sup>n</sup>, we define by <sup>f</sup> ◦ *<sup>x</sup>* <sup>∈</sup> <sup>Y</sup> <sup>n</sup> the tuple (f(*x*[1]),...,f(*x*[n])). Let AP be a finite set of atomic propositions and let Σ = 2AP be the corresponding alphabet. A *trace* <sup>t</sup> <sup>∈</sup> <sup>Σ</sup><sup>ω</sup> is an infinite sequence of elements of <sup>Σ</sup>. We denote a set of traces by *Tr* <sup>⊆</sup> <sup>Σ</sup><sup>ω</sup>. We define <sup>t</sup>[i,∞] to be the suffix of <sup>t</sup> starting at position <sup>i</sup> <sup>≥</sup> 0.

*HyperLTL.* HyperLTL [6] is a temporal logic for specifying hyperproperties. It extends LTL by quantification over trace variables π and a method to link atomic propositions to specific traces. Let V be an infinite set of trace variables. Formulas in HyperLTL are given by the grammar

ϕ ::= ∀π.ϕ | ∃π.ϕ | ψ , and ψ ::= aπ | ¬ψ | ψ ∨ ψ | ◯ψ | ψ U ψ ,

where a ∈ AP and π ∈ V. We allow the standard boolean connectives ∧, →, ↔ as well as the derived LTL operators release ϕ R ψ ≡ ¬(¬ϕ U ¬ψ), eventually ◇ϕ ≡ *true* U ϕ, globally □ϕ ≡ ¬◇¬ϕ, and weak until ϕ W ψ ≡ □ϕ ∨ (ϕ U ψ).

We call a Q+Q′+ϕ HyperLTL formula (for Q, Q′ ∈ {∀, ∃} and quantifier-free formula ϕ) *alternation-free* iff Q = Q′. Further, we say that Q+Q′+ϕ has *one quantifier alternation* (or lies in the *one-alternation fragment*) iff Q ≠ Q′.

The semantics of HyperLTL is given by the satisfaction relation ⊨*Tr* over a set of traces *Tr* ⊆ Σω. We define an assignment Π : V → Σω that maps trace variables to traces. Π[π ↦ t] updates Π by assigning variable π to trace t.

Π, i ⊨_Tr a_π iff a ∈ Π(π)[i]
Π, i ⊨_Tr ¬ϕ iff Π, i ⊭_Tr ϕ
Π, i ⊨_Tr ϕ ∨ ψ iff Π, i ⊨_Tr ϕ or Π, i ⊨_Tr ψ
Π, i ⊨_Tr ◯ϕ iff Π, i + 1 ⊨_Tr ϕ
Π, i ⊨_Tr ϕ U ψ iff ∃j ≥ i. Π, j ⊨_Tr ψ ∧ ∀i ≤ k < j. Π, k ⊨_Tr ϕ
Π, i ⊨_Tr ∃π. ϕ iff there is some t ∈ *Tr* such that Π[π → t], i ⊨_Tr ϕ
Π, i ⊨_Tr ∀π. ϕ iff for all t ∈ *Tr* it holds that Π[π → t], i ⊨_Tr ϕ

We write *Tr* ⊨ ϕ for {}, 0 ⊨_Tr ϕ, where {} denotes the empty assignment.
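The semantics above can be made concrete with a toy evaluator, hedged in two ways: traces are finite prefixes (sequences of sets of propositions), and U is evaluated only within the common prefix length. The AST encoding and all names are our own.

```python
# Toy HyperLTL evaluator over finite trace prefixes (encoding is ours).
# A trace is a sequence of sets of atomic propositions; `assign` is the
# assignment Π from trace variables to traces.
def holds(phi, assign, i, traces):
    op = phi[0]
    if op == "ap":                         # a_π: proposition a on trace π
        _, a, pi = phi
        return a in assign[pi][i]
    if op == "not":
        return not holds(phi[1], assign, i, traces)
    if op == "or":
        return holds(phi[1], assign, i, traces) or holds(phi[2], assign, i, traces)
    if op == "next":                       # ◯φ
        return holds(phi[1], assign, i + 1, traces)
    if op == "until":                      # φ U ψ, bounded by prefix length
        n = min(len(t) for t in assign.values())
        return any(holds(phi[2], assign, j, traces) and
                   all(holds(phi[1], assign, k, traces) for k in range(i, j))
                   for j in range(i, n))
    if op == "exists":                     # ∃π. φ ranges over the trace set
        _, pi, body = phi
        return any(holds(body, {**assign, pi: t}, i, traces) for t in traces)
    if op == "forall":                     # ∀π. φ
        _, pi, body = phi
        return all(holds(body, {**assign, pi: t}, i, traces) for t in traces)
    raise ValueError(f"unknown operator {op}")
```

For instance, `("forall", "p", ("exists", "q", ("until", ("ap", "a", "p"), ("ap", "b", "q"))))` encodes ∀π ∃π′. a_π U b_π′.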

Every hyperproperty is an intersection of a hypersafety and a hyperliveness property [7]. A *hypersafety* property is one where every violating trace set has a *bad prefix*: a finite set of finite traces that cannot be extended into a set of traces that satisfies the property. A *hyperliveness* property is a property where every finite set of finite traces can be extended to a possibly infinite set of infinite traces such that the resulting trace set satisfies the property.

*Transition Systems.* We use transition systems as a model of computation for reactive systems. Transition systems consume sequences over an input alphabet by transforming their internal state in every step. Let I and O be finite sets of input and output propositions, respectively, and let Υ = 2^I and Γ = 2^O be the corresponding finite alphabets. A Γ-labeled Υ-*transition system* S is a tuple ⟨S, s_0, τ, l⟩, where S is a finite set of states, s_0 ∈ S is the designated initial state, τ : S × Υ → S is the transition function, and l : S → Γ is the state-labeling function. We write s →^υ s′ or (s, υ, s′) ∈ τ if τ(s, υ) = s′. We generalize the transition function to sequences over Υ by defining τ* : Υ* → S recursively as τ*() = s_0 and τ*(υ_0 ··· υ_{n−1} υ_n) = τ(τ*(υ_0 ··· υ_{n−1}), υ_n) for υ_0 ··· υ_{n−1} υ_n ∈ Υ^+. Given an infinite word υ = υ_0 υ_1 ... ∈ Υ^ω, the transition system produces an infinite sequence of outputs γ = γ_0 γ_1 γ_2 ... ∈ Γ^ω such that γ_i = l(τ*(υ_0 ... υ_{i−1})) for every i ≥ 0.
The resulting *trace* ρ is (υ_0 ∪ γ_0)(υ_1 ∪ γ_1)... ∈ Σ^ω, where AP = I ∪ O. The set of traces generated by S is denoted by *traces*(S). Furthermore, we define ε = ⟨{s}, s, τ_ε, l_ε⟩ as the transition system over I = O = ∅ that has only a single trace, that is, *traces*(ε) = {∅^ω}. For this transition system, τ_ε(s, ∅) = s and l_ε(s) = ∅. Given two transition systems S = ⟨S, s_0, τ, l⟩ and S′ = ⟨S′, s′_0, τ′, l′⟩, we define S × S′ = ⟨S × S′, (s_0, s′_0), τ″, l″⟩ as the Γ²-labeled Υ²-transition system where τ″((s, s′), (υ, υ′)) = (τ(s, υ), τ′(s′, υ′)) and l″((s, s′)) = (l(s), l′(s′)). A transition system S satisfies a general HyperLTL formula ϕ if, and only if, *traces*(S) ⊨ ϕ.
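The definitions above translate directly into code; a minimal sketch (class and method names are ours), with letters of Υ and Γ represented as Python sets of propositions:

```python
# A Γ-labeled Υ-transition system ⟨S, s0, τ, l⟩ (names are ours).
class TransitionSystem:
    def __init__(self, s0, tau, label):
        self.s0, self.tau, self.label = s0, tau, label

    def run(self, inputs):
        """τ*: fold τ over a finite input word, starting in s0."""
        s = self.s0
        for u in inputs:
            s = self.tau(s, u)
        return s

    def trace_prefix(self, inputs):
        """Prefix of the trace (υ_i ∪ γ_i) with γ_i = l(τ*(υ_0 … υ_{i-1}))."""
        out, s = [], self.s0
        for u in inputs:
            out.append(u | self.label(s))
            s = self.tau(s, u)
        return out
```

A two-state toggle that emits x exactly in its initial state illustrates the output offset: the label of the current state is attached before the transition is taken.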

*Automata.* An alternating parity automaton A over a finite alphabet Σ is a tuple ⟨Q, q_0, δ, α⟩, where Q is a finite set of states, q_0 ∈ Q is the designated initial state, δ : Q × Σ → 𝔹⁺(Q) is the transition function, and α : Q → C is a function that maps states of A to a finite set of colors C ⊂ ℕ. For C = {0, 1} and C = {1, 2}, we call A a co-Büchi and Büchi automaton, respectively, and we use the sets F ⊆ Q and B ⊆ Q to represent the rejecting (C = 1) and accepting (C = 2) states in the respective automaton (as a replacement for the coloring function α). A safety automaton is a Büchi automaton where every state is accepting. The transition function δ maps a state q ∈ Q and some a ∈ Σ to a positive Boolean combination of successor states δ(q, a). An automaton is *non-deterministic* or *universal* if δ is purely disjunctive or conjunctive, respectively.

A run of an alternating automaton is a Q-labeled tree. A tree T is a subset of ℕ*_{>0} such that for every node n ∈ ℕ*_{>0} and every positive integer i ∈ ℕ_{>0}, if n · i ∈ T then (i) n ∈ T (i.e., T is prefix-closed), and (ii) for every 0 < j < i, n · j ∈ T. The root of T is the empty sequence ϵ, and for a node n ∈ T, |n| is the length of the sequence n, in other words, its distance from the root. A run of A on an infinite word ρ ∈ Σ^ω is a Q-labeled tree (T, r) such that r(ϵ) = q_0 and for every node n ∈ T with children n_1, ..., n_k the following holds: 1 ≤ k ≤ |Q| and {r(n_1), ..., r(n_k)} ⊨ δ(q, ρ[i]), where q = r(n) and i = |n|. A path is accepting if the highest color appearing infinitely often is even. A run is accepting if all its paths are accepting. The language of A, written L(A), is the set {ρ ∈ Σ^ω | A accepts ρ}. A transition system S is accepted by an automaton A, written S ⊨ A, if *traces*(S) ⊆ L(A).

*Strategies.* Given two disjoint finite alphabets Υ and Γ, a strategy σ : Υ* → Γ is a mapping from finite histories over Υ to Γ. A transition system S = ⟨S, s_0, τ, l⟩ *generates* the strategy σ if σ(*υ*) = l(τ*(*υ*)) for every *υ* ∈ Υ*. A strategy σ is called *finite-state* if there exists a transition system that generates σ.

In the following, we use finite-state strategies to modify the inputs of transition systems. Let S = ⟨S, s_0, τ, l⟩ be a transition system over input and output alphabets Υ and Γ, and let σ : (Υ′)* → Υ be a finite-state strategy over an input alphabet Υ′. Let S′ = ⟨S′, s′_0, τ′, l′⟩ be the transition system implementing σ; then S ∥ σ = S ∥ S′ is the transition system ⟨S × S′, (s_0, s′_0), τ_∥, l_∥⟩, where τ_∥ : (S × S′) × Υ′ → (S × S′) is defined as τ_∥((s, s′), υ′) = (τ(s, l′(s′)), τ′(s′, υ′)) and l_∥ : (S × S′) → Γ is defined as l_∥(s, s′) = l(s) for every s ∈ S, s′ ∈ S′, and υ′ ∈ Υ′.
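The composition S ∥ σ can be sketched as follows, with systems given as `(s0, tau, label)` triples of plain functions (an encoding of ours):

```python
# S ∥ S′: the strategy system S′ (over alphabet Υ′) feeds its output
# l′(s′) into S as input, mirroring τ_∥ and l_∥ from the text.
def parallel(sys, strat):
    s0, tau, label = sys            # Γ-labeled Υ-transition system S
    t0, tau_p, label_p = strat      # Υ-labeled Υ′-transition system S′
    def tau_par(state, u_prime):
        s, t = state
        return (tau(s, label_p(t)), tau_p(t, u_prime))
    def label_par(state):
        return label(state[0])
    return ((s0, t0), tau_par, label_par)
```

Note that the composed system reads Υ′-inputs only; the inputs of S are supplied entirely by the strategy's labeling.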

*Model Checking HyperLTL.* We recap the model checking of universal HyperLTL formulas. This case, as well as the dual case of only existential quantifiers, is well understood and, in fact, efficiently implemented in the model checker MCHyper [18]. The principle behind the model checking approach is *self-composition*: we check a standard trace property on a composition of an appropriate number of copies of the given system.

Let *zip* denote the function that maps an n-tuple of sequences to a single sequence of n-tuples, for example, *zip*([1, 2, 3], [4, 5, 6]) = [(1, 4), (2, 5), (3, 6)], and let *unzip* denote its inverse. Given S = ⟨S, s_0, τ, l⟩, the n-fold self-composition of S is the transition system S^n = ⟨S^n, **s**_0, τ_n, l_n⟩, where **s**_0 := (s_0, ..., s_0) ∈ S^n, τ_n(**s**, **υ**) := τ ◦ *zip*(**s**, **υ**), and l_n(**s**) := l ◦ **s** for every **s** ∈ S^n and **υ** ∈ Υ^n. If *traces*(S) is the set of traces generated by S, then {*zip*(ρ_1, ..., ρ_n) | ρ_1, ..., ρ_n ∈ *traces*(S)} is the set of traces generated by S^n. We use the notation *zip*(ϕ, π_1, π_2, ..., π_n) for some HyperLTL formula ϕ to combine the trace variables π_1, π_2, ..., π_n (occurring free in ϕ) into a fresh trace variable π_∗.
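In Python, *zip* and *unzip* are exactly the built-in `zip` (and `zip(*…)` for its inverse), and τ_n is componentwise application of τ; a sketch under our own naming:

```python
# zip / unzip and the n-fold self-composition transition function τ_n.
def zip_traces(*traces):
    """Maps an n-tuple of sequences to one sequence of n-tuples."""
    return list(zip(*traces))

def unzip_traces(zipped):
    """Inverse of zip_traces."""
    return [list(component) for component in zip(*zipped)]

def tau_n(tau, states, inputs):
    """τ_n(s, υ) = τ ∘ zip(s, υ): apply τ to each (state, input) pair."""
    return tuple(tau(s, u) for s, u in zip(states, inputs))
```

The first assertion below reproduces the example from the text.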

**Theorem 1 (Self-composition for universal HyperLTL formulas** [18]**).** *For a transition system* S *and a HyperLTL formula of the form* ∀π_1. ∀π_2. ... ∀π_n. ϕ *it holds that* S ⊨ ∀π_1. ∀π_2. ... ∀π_n. ϕ *iff* S^n ⊨ ∀π_∗. *zip*(ϕ, π_1, π_2, ..., π_n)*.*

**Theorem 2 (Complexity of model checking universal formulas** [18]**).** *The model checking problem for universal HyperLTL formulas is PSPACE-complete in the size of the formula and NLOGSPACE-complete in the size of the transition system.*

The complexity of verifying universal HyperLTL formulas is exactly the same as the complexity of verifying LTL formulas. For HyperLTL formulas with quantifier alternations, the model checking problem is significantly more difficult.

**Theorem 3 (Complexity of model checking formulas with one quantifier alternation** [18]**).** *The model checking problem for HyperLTL formulas with one quantifier alternation is in EXPSPACE in the size of the formula and in PSPACE in the size of the transition system.*

One way to circumvent this complexity is to fix the existential choice and strengthen the formula to the universal fragment [9,13,18]. While avoiding the complexity problem, this transformation requires deep knowledge of the system, is prone to errors, and cannot be verified automatically as the problem of checking implications becomes undecidable [11]. In the following section, we present a technique that circumvents the complexity problem while still inheriting strong correctness guarantees. Further, we provide a method that can, under certain restrictions, derive a strategy for the existential choice automatically.

#### **3 Model Checking with Quantifier Alternations**

#### **3.1 Model Checking with Given Strategies**

Our first goal is the verification of HyperLTL formulas with one quantifier alternation, i.e., formulas of the form ∀∗∃∗ϕ or ∃∗∀∗ϕ, where ϕ is a quantifier-free formula. Note that the presented techniques can, similar to Skolemization, be extended to more than one quantifier alternation. Quantifier alternation introduces dependencies between the quantified traces. In a ∀∗∃∗ϕ formula, the choices of the existential quantifiers depend on the choices of the universal quantifiers preceding them. In a formula of the form ∃∗∀∗ϕ, however, there has to be a single choice for the existential quantifiers that works for all choices of the universal quantifiers. In this case, the existentially quantified variables do not depend on the universally quantified variables; hence, the witnesses for the existential quantifiers are traces rather than functions that map tuples of traces to traces. As established above, the model checking problem for HyperLTL formulas with quantifier alternation is known to be significantly more difficult than the model checking problem for universal formulas.

Our verification technique for formulas with quantifier alternation is to substitute strategic choice for existential choice. As discussed in the introduction, the existence of a strategy implies the existence of a trace.

#### **Theorem 4 (Substituting Strategic Choice for Existential Choice).** *Let* <sup>S</sup> *be a transition system over input alphabet* <sup>Υ</sup>*.*

*It holds that* S ⊨ ∀π_1 ∀π_2 ... ∀π_n. ∃π′_1 ∃π′_2 ... ∃π′_m. ϕ *if there is a strategy* σ : (Υ^n)* → Υ^m *such that* S^n × (S^m ∥ σ) ⊨ ∀π_∗. *zip*(ϕ, π_1, π_2, ..., π_n, π′_1, π′_2, ..., π′_m)*. It holds that* S ⊨ ∃π_1 ∃π_2 ... ∃π_m. ∀π′_1 ∀π′_2 ... ∀π′_n. ϕ *if there is a strategy* σ : (Υ^0)* → Υ^m *such that* (S^m ∥ σ) × S^n ⊨ ∀π_∗. *zip*(ϕ, π_1, π_2, ..., π_m, π′_1, π′_2, ..., π′_n)*.*

*Proof.* Let σ be such a strategy. For the first claim, we define a witness for the existential trace quantifiers ∃π′_1 ∃π′_2 ... ∃π′_m as the sequence of inputs **υ** = **υ**_0 **υ**_1 ... ∈ (Υ^m)^ω such that **υ**_i = σ(**υ′**_0 **υ′**_1 ... **υ′**_{i−1}) for every i ≥ 0 and every **υ′**_i ∈ Υ^n; analogously, for the second claim, we define a witness for the existential trace quantifiers ∃π_1 ∃π_2 ... ∃π_m as the sequence of inputs **υ** = **υ**_0 **υ**_1 ... ∈ (Υ^m)^ω such that **υ**_i = σ(**υ′**_0 **υ′**_1 ... **υ′**_{i−1}) for every i ≥ 0 and every **υ′**_i ∈ Υ^0.

An application of the theorem reduces the verification problem of a HyperLTL formula with one quantifier alternation to the verification problem of a universal HyperLTL formula. If a sufficiently small strategy can be found, the reduction in complexity is substantial:

**Corollary 1 (Model checking with Given Strategies).** *The model checking problem for HyperLTL formulas with one quantifier alternation and given strategies for the existential quantifiers is in PSPACE in the size of the formula and NLOGSPACE in the size of the product of the strategy and the system.*

Note that the converse of Theorem 4 is not true in general. The satisfaction of a ∀∗∃∗ HyperLTL formula does not imply the existence of a strategy, because at any given point in time the strategy only knows a finite prefix of the universally quantified traces. Consider the formula ∀π ∃π′. □(◯a_π ↔ a_π′) and a system that can produce arbitrary sequences of a and ¬a. Although the system satisfies the formula, it is not possible to give a strategy that allows us to prove this fact: whatever choice our strategy makes, the next move of the ∀-player can make sure that the strategy's choice was wrong. In the following, we present a method that addresses this problem.
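This argument can be replayed concretely: the strategy must fix a_π′ at step i having seen a_π only up to step i−1, so the ∀-player can always flip the next input. A small sketch in our own 0/1 encoding:

```python
# For ∀π∃π'. □(◯a_π ↔ a_π'): build an input word on which a given
# strategy is wrong at every position, by playing the opposite of the
# strategy's latest choice. (Encoding and names are ours.)
def refute(strategy, steps=4):
    a = [0]                                 # a_π[0] is unconstrained
    for i in range(steps):
        choice = strategy(tuple(a[:i]))     # a_π'[i], from a_π[0..i-1]
        a.append(1 - choice)                # adversary: a_π[i+1] ≠ a_π'[i]
    return a

def violations(strategy, a):
    """Positions i where a_π'[i] ≠ a_π[i+1], i.e., the spec fails."""
    return [i for i in range(len(a) - 1)
            if strategy(tuple(a[:i])) != a[i + 1]]
```

For every deterministic strategy, `violations(strategy, refute(strategy))` contains every position.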

**Prophecy Variables.** A classic technique for resolving future dependencies is the introduction of *prophecy variables* [1]. Prophecy variables are auxiliary variables that are added to the system without affecting the behavior of the system. Such variables can be used to make predictions about the future.

We use prophecy variables to define strategies that depend on the future. In the example discussed above, ∀π ∃π′. □(◯a_π ↔ a_π′), the choice of the value of a_π′ in the first position depends on the value of a_π in the second position. We introduce a prophecy variable p that predicts in the first position whether a_π is true in the second position. With the prophecy variable, there exists a strategy that correctly assigns the value of a_π′ whenever the prediction is correct: the strategy chooses to set a_π′ if, and only if, p holds.

Technically, the proof technique introduces a set of fresh input variables P into the system. For a Γ-labeled Υ-transition system S = ⟨S, s_0, τ, l⟩, we define the Γ-labeled 2^(I ∪ P)-transition system S^P = ⟨S, s_0, τ^P, l⟩ including the inputs P, where τ^P : S × 2^(I ∪ P) → S. For all s ∈ S and υ_P ∈ 2^(I ∪ P), τ^P(s, υ_P) = τ(s, υ) for the υ ∈ Υ obtained by removing the variables in P from υ_P (i.e., υ = υ_P \ P). Moreover, the proof technique modifies the specification so that the original property only needs to be satisfied if the prediction is actually correct. In our example, we obtain the modified specification ∀π ∃π′. □(p_π ↔ ◯a_π) → □(◯a_π ↔ a_π′). The following theorem describes the general technique for one prophecy variable.

**Theorem 5 (Model checking with Prophecy Variables).** *For a transition system* S *and a quantifier-free formula* ϕ*, let* ψ *be a quantifier-free formula over the universally quantified trace variables* π_1, π_2, ..., π_n *and let* p *be a fresh atomic proposition. It holds that* S ⊨ ∀π_1 ∀π_2 ... ∀π_n. ∃π′_1 ∃π′_2 ... ∃π′_m. ϕ *if, and only if,* S^{p} ⊨ ∀π_1 ∀π_2 ... ∀π_n. ∃π′_1 ∃π′_2 ... ∃π′_m. □(p_{π_1} ↔ ψ) → ϕ.

Note that ψ is restricted to refer only to *universally* quantified trace variables. Without this restriction, the method would not be sound. In our example, ψ = a_π′ would lead to the modified formula ∀π ∃π′. □(p_π ↔ a_π′) → □(◯a_π ↔ a_π′), which could be satisfied by the strategy that assigns a_π′ to *true* iff p_π is *false*, and thus falsifies the assumption that the prediction is correct, rather than ensuring that the original formula is true.

*Proof.* It is easy to see that the original specification implies the modified specification, since the original formula is the conclusion of the implication. Assume that the modified specification holds. Since the prophecy variable p is a fresh atomic proposition, and ψ does not refer to the existentially chosen traces, we can, for every choice of the universally quantified traces, always choose the value of p such that it guesses correctly, i.e., that p is true whenever ψ holds. In this case, the conclusion and therefore the original specification must be true.
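For the running example, the modified specification can be checked exhaustively on finite prefixes: with ψ = ◯a_π, the strategy a_π′[i] := p[i] that simply trusts the prophecy satisfies the implication on every input, while wrong predictions merely make it vacuous. A bounded sketch (encoding ours):

```python
from itertools import product

# Finite-prefix check of □(p_π ↔ ◯a_π) → □(◯a_π ↔ a_π') for the running
# example, with the strategy a_π'[i] := p[i]. (Encoding is ours.)
def modified_spec_holds(a, p):
    n = len(a) - 1                        # positions where ◯a_π is defined
    a_prime = list(p[:n])                 # the strategy's choices for a_π'
    prediction_ok = all(p[i] == a[i + 1] for i in range(n))
    original_ok = all(a_prime[i] == a[i + 1] for i in range(n))
    return (not prediction_ok) or original_ok

def always_holds(length=5):
    """Exhaustively check all 0/1 words of the given length."""
    words = list(product([0, 1], repeat=length))
    return all(modified_spec_holds(list(a), list(p))
               for a in words for p in words)
```

Since the strategy copies p, the conclusion holds exactly when the prediction does, so the implication is never falsified.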

Unfortunately, prophecy variables do not provide a complete proof technique. Consider a system allowing arbitrary sequences of a and b and this specification:

$$\begin{aligned} \forall \pi \exists \pi'.\; & b_{\pi'} \land \Box (b_{\pi'} \leftrightarrow \bigcirc \neg b_{\pi'})\\ & \land (a_{\pi'} \to (a_{\pi} \,\mathcal{W}\, (b_{\pi'} \land \neg a_{\pi})))\\ & \land (\neg a_{\pi'} \to (a_{\pi} \,\mathcal{W}\, (\neg b_{\pi'} \land \neg a_{\pi}))) \end{aligned}$$

Intuitively, π′ has to predict whether π will stop outputting a at an even or at an odd position of the trace. There is no HyperLTL formula that can be used as ψ in Theorem 5 because, like LTL, HyperLTL can only express non-counting properties. It is worth noting that in our practical experiments, the incompleteness was never a problem. In many cases, it is not even necessary to add prophecy variables at all. The presented proof technique is thus practically useful despite this incompleteness result.

#### **3.2 Model Checking with Synthesized Strategies**

We now extend the model checking approach with the automatic synthesis of the strategies for the existential quantifiers. For a given HyperLTL formula of the form ∀^n∃^mϕ and a transition system S, we search for a transition system S_∃ = ⟨X, x_0, μ, l_∃⟩, where X is a set of states, x_0 ∈ X is the designated initial state, μ : X × Υ^n → X is the transition function, and l_∃ : X → Υ^m is the labeling function, such that S^n × (S^m ∥ S_∃) ⊨ ∀π_∗. *zip*(ϕ, π_1, ..., π_n, π′_1, ..., π′_m). (Since for formulas of the form ∃^m∀^nϕ the problem only differs in the input of S_∃, we focus on ∀∃ HyperLTL.)

**Theorem 6.** *The strategy realizability problem for* ∀∗∃∗ *formulas is* 2ExpTime*-complete.*

*Proof (Sketch).* We reduce the strategy synthesis problem to the problem of synthesizing a distributed reactive system with a single black-box process. This problem is decidable [19] and can be solved in 2ExpTime. The lower bound follows from the LTL realizability problem [30].

The decidability result implies that there is an upper bound on the size of S_∃ that is doubly exponential in ϕ. Thus, the bounded synthesis approach [20] can be used to search for increasingly larger implementations until a solution is found or the maximal bound is reached, yielding an efficient decision procedure for the strategy synthesis problem. In the following, we describe this approach in detail.
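The resulting search loop is simple; here is a sketch with the per-bound query left abstract (`find_strategy` is a hypothetical stand-in for solving the SMT constraint system of the next paragraphs at a fixed size bound):

```python
# Bounded synthesis loop: try strategy sizes 1, 2, ... up to max_bound.
# `find_strategy(bound)` is a placeholder for the SMT query and is
# expected to return a model (e.g., μ and l_∃) or None if unsatisfiable.
def bounded_synthesis(find_strategy, max_bound):
    for bound in range(1, max_bound + 1):
        model = find_strategy(bound)
        if model is not None:
            return model               # realizing S_∃ of size ≤ bound
    return None                        # unrealizable within max_bound
```

By the doubly exponential bound above, choosing `max_bound` accordingly turns this loop into a decision procedure.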

**Bounded Synthesis of Strategies.** We transform the synthesis problem into an SMT constraint satisfaction problem, where we leave the representation of strategies uninterpreted and challenge the solver to provide an interpretation. Given a HyperLTL formula ∀^n∃^mϕ where ϕ is quantifier-free, the model checking is based on the product of the n-fold self-composition of the transition system S, the m-fold self-composition of S where the strategy S_∃ controls the inputs, and the universal co-Büchi automaton A_ϕ representing the language L(ϕ) of ϕ.

For a quantifier-free HyperLTL formula ϕ, we construct the universal co-Büchi automaton A_ϕ such that L(A_ϕ) is the set of words w such that *unzip*(w) ⊨ ϕ, i.e., the tuple of traces satisfies ϕ. We obtain this automaton by dualizing the non-deterministic Büchi automaton for ¬ϕ [6], i.e., changing the branching from non-deterministic to universal and the acceptance condition from Büchi to co-Büchi. Hence, S satisfies a universal HyperLTL formula ∀π_1 ... ∀π_n. ϕ if the traces generated by the self-composition S^n are a subset of L(A_ϕ).

In more detail, the algorithm searches for a transition system S_∃ = ⟨X, x_0, μ, l_∃⟩ such that the run graph of S^n, S^m ∥ S_∃, and A_ϕ, written S^n × (S^m ∥ S_∃) × A_ϕ, is accepting. Formally, given a Γ-labeled Υ-transition system S = ⟨S, s_0, τ, l⟩ and a universal co-Büchi automaton A_ϕ = ⟨Q, q_0, δ, F⟩, where δ : Q × Υ^(n+m) × Γ^(n+m) → 2^Q, the run graph S^n × (S^m ∥ S_∃) × A_ϕ is the directed graph (V, E) with the set of vertices V = S^n × S^m × X × Q, initial vertex v_init = ((s_0, ..., s_0), (s_0, ..., s_0), x_0, q_0), and the edge relation E ⊆ V × V satisfying ((**s**_n, **s**_m, x, q), (**s′**_n, **s′**_m, x′, q′)) ∈ E if, and only if

$$\begin{split} \exists \boldsymbol{\upsilon} \in \Upsilon^{n}.\;\; & \Big( \boldsymbol{s}_{n} \xrightarrow[\tau_{n}]{\boldsymbol{\upsilon}} \boldsymbol{s}_{n}' \Big) \land \Big( \boldsymbol{s}_{m} \xrightarrow[\tau_{m}]{l_{\exists}(x)} \boldsymbol{s}_{m}' \Big) \land \Big( x \xrightarrow[\mu]{\boldsymbol{\upsilon}} x' \Big) \\ & \land\; q' \in \delta\big(q,\; \boldsymbol{\upsilon} \cdot l_{\exists}(x),\; l_{n}(\boldsymbol{s}_{n}) \cdot l_{m}(\boldsymbol{s}_{m})\big). \end{split}$$

**Theorem 7.** *Given* S*,* S_∃*, and a HyperLTL formula* ∀^n∃^mϕ *where* ϕ *is quantifier-free. Let* A_ϕ *be the universal co-Büchi automaton for* ϕ*. If the run graph* S^n × (S^m ∥ S_∃) × A_ϕ *is accepting, then* S ⊨ ∀^n∃^mϕ*.*

*Proof.* Follows from Theorem 4 and the fact that A_ϕ represents L(ϕ).

The acceptance of a run graph is witnessed by an annotation λ : V → ℕ ∪ {⊥}, a function mapping every reachable vertex v ∈ V in the run graph to a natural number λ(v), i.e., λ(v) ≠ ⊥. Intuitively, λ(v) bounds the number of visits to rejecting states on any path from the initial vertex v_init to v. If we can bound this number for every reachable vertex, the annotation is *valid* and the run graph is accepting. Formally, an annotation λ is valid if (1) the initial vertex is reachable (λ(v_init) ≠ ⊥) and (2) for every (v, v′) ∈ E with λ(v) ≠ ⊥ it holds that λ(v′) ≠ ⊥ and λ(v) ◁ λ(v′), where ◁ is < if v′ is rejecting and ≤ otherwise. Such an annotation exists if, and only if, the run graph is accepting [20].
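The least such annotation can be computed by a simple fixpoint over an explicit run graph; a sketch in our own representation, where `None` plays the role of ⊥:

```python
# Compute the least valid annotation λ of a run graph, or return None if
# a rejecting cycle is reachable (the run graph is then not accepting).
def least_annotation(vertices, edges, v_init, rejecting):
    bound = len([v for v in vertices if v in rejecting])
    lam = {v: None for v in vertices}          # None encodes ⊥
    lam[v_init] = 1 if v_init in rejecting else 0
    changed = True
    while changed:
        changed = False
        for v, w in edges:
            if lam[v] is None:
                continue
            need = lam[v] + (1 if w in rejecting else 0)
            if need > bound:                   # more rejecting visits than
                return None                    # rejecting vertices: a cycle
            if lam[w] is None or lam[w] < need:
                lam[w] = need
                changed = True
    return lam
```

A simple path visits each rejecting vertex at most once, so any count exceeding the number of rejecting vertices witnesses a reachable rejecting cycle.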

We encode the search for S_∃ and the annotation λ as an SMT constraint system. To this end, we use uninterpreted function symbols to encode S_∃ and λ. A transition system S is represented in the constraint system by two functions, the transition function τ : S × Υ → S and the labeling function l : S → Γ. The annotation is split into two parts: a reachability constraint λ^𝔹 : V → 𝔹 indicating whether a vertex in the run graph is reachable, and a counter λ^ℕ : V → ℕ that maps every reachable vertex v to the maximal number of rejecting states visited by any path from the initial vertex to v. The resulting constraint asserts that there is a transition system S_∃ with an accepting run graph. Note that the functions representing the system S (τ : S × Υ → S and l : S → Γ) are given, that is, they are interpreted.

∃λ^𝔹 : S^n × S^m × X × Q → 𝔹. ∃λ^ℕ : S^n × S^m × X × Q → ℕ. ∃μ : X × Υ^n → X. ∃l_∃ : X → Υ^m.
∀**υ** ∈ Υ^n. ∀**s**_n, **s′**_n ∈ S^n. ∀**s**_m, **s′**_m ∈ S^m. ∀q, q′ ∈ Q. ∀x, x′ ∈ X.
λ^𝔹((s_0, ..., s_0), (s_0, ..., s_0), x_0, q_0)
∧ ( λ^𝔹(**s**_n, **s**_m, x, q) ∧ q′ ∈ δ(q, **υ** · l_∃(x), l ◦ (**s**_n · **s**_m)) ∧ x′ = μ(x, **υ**) ∧ **s′**_n = τ_n(**s**_n, **υ**) ∧ **s′**_m = τ_m(**s**_m, l_∃(x))
⇒ λ^𝔹(**s′**_n, **s′**_m, x′, q′) ∧ λ^ℕ(**s**_n, **s**_m, x, q) ◁ λ^ℕ(**s′**_n, **s′**_m, x′, q′) )

where ◁ is < if q′ ∈ F and ≤ otherwise. The *bounded synthesis algorithm* increases the bound on the size of the strategy S_∃ until either the constraint system becomes satisfiable or a given upper bound is reached. In case the constraint system is satisfiable, we can extract interpretations for the functions μ and l_∃ using a solver that is able to produce models. These functions then represent the synthesized transition system S_∃.

**Corollary 2.** *Given* S *and a HyperLTL formula* ∀∗∃∗ϕ *where* ϕ *is quantifier-free. If the constraint system is satisfiable for some bound on the size of* S_∃*, then* S ⊨ ∀∗∃∗ϕ*.*

*Proof.* Follows immediately by Theorem 7.

As the decision problem is decidable, we know that there is an upper bound on the size of a realizing S<sup>∃</sup> and, thus, the bounded synthesis approach is a decision procedure for the strategy realizability problem.

**Corollary 3.** *The bounded synthesis algorithm decides the strategy realizability problem for* ∀<sup>∗</sup>∃<sup>∗</sup> *HyperLTL.*

*Proof.* The existence of such an upper bound follows from Theorem 6.

**Approximating Prophecy.** We introduce a new parameter to the strategy synthesis problem to approximate the information about the future that can be captured using prophecy variables. This bound represents a constant *lookahead* into future choices made by the environment. In other words, for a given <sup>k</sup> <sup>≥</sup> 0, the strategy <sup>S</sup><sup>∃</sup> is allowed to depend on choices of the <sup>∀</sup>-player in the next <sup>k</sup> steps. While constant lookahead is only an approximation of infinite clairvoyance, it suffices for many practical situations as shown by prior case studies [9,18].

We present a solution to synthesizing transition systems with constant lookahead for <sup>k</sup> <sup>≥</sup> 0 using bounded synthesis. To simplify the presentation, we present the stand-alone problem with respect to a specification given as a universal co-Büchi automaton. The integration into the constraint system for the ∀<sup>∗</sup>∃<sup>∗</sup> HyperLTL synthesis as presented in the previous section is then straightforward. First, we present an extension to the transition system model that incorporates the notion of constant lookahead. The idea of this extension is to replace the initial state s<sub>0</sub> by a function *init* : Υ<sup>k</sup> → S that maps input sequences of length k to some state. Thus, the transition system observes the first k inputs, chooses some initial state based on those inputs, and then progresses at the same pace as the input sequence. Next, we define the run graph of such a system S<sub>k</sub> = ⟨S, *init*, τ, l⟩ and an automaton A = ⟨Q, q<sub>0</sub>, δ, F⟩, where δ : Q × Υ × Γ → Q, as the directed graph (V, E) with the set of vertices V = S × Q × Υ<sup>k</sup>, the initial vertices (s, q<sub>0</sub>, **υ**) ∈ V such that s = *init*(**υ**) for every **υ** ∈ Υ<sup>k</sup>, and the edge relation E ⊆ V × V satisfying ((s, q, υ<sub>1</sub>υ<sub>2</sub> ··· υ<sub>k</sub>), (s′, q′, υ′<sub>1</sub>υ′<sub>2</sub> ··· υ′<sub>k</sub>)) ∈ E if, and only if

$$\exists \upsilon\_{k+1} \in \mathcal{T}. s \xrightarrow{\upsilon\_{k+1}} s' \land q' \in \delta(q, \upsilon\_1, l(s)) \land \bigwedge\_{1 \le i \le k} \upsilon'\_i = \upsilon\_{i+1}.$$
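To make the constant-lookahead semantics concrete, the following sketch simulates a k-lookahead transition system on a finite input word: *init* consumes the first k inputs, after which each transition consumes one further input, so the output at step i may depend on inputs up to position i + k − 1. The concrete system used below (`init`, `tau`, `out`) is an illustrative toy, not taken from the paper.

```python
def run_lookahead(init, tau, out, word, k):
    """Simulate a k-lookahead transition system on a finite input word.

    init: maps the tuple of the first k inputs to an initial state,
    tau:  transition function (state, input) -> state,
    out:  output labeling, state -> output symbol.
    Returns the produced output sequence (one output per remaining step).
    """
    assert len(word) >= k
    state = init(tuple(word[:k]))   # the first k inputs pick the start state
    outputs = [out(state)]
    for upsilon in word[k:]:        # then the system runs at the input's pace
        state = tau(state, upsilon)
        outputs.append(out(state))
    return outputs

# Toy system with lookahead k=1: the state simply stores the latest input,
# so each output already reflects the input arriving at the same step.
outs = run_lookahead(init=lambda buf: buf[0],
                     tau=lambda s, u: u,
                     out=lambda s: s,
                     word=[0, 1, 1, 0], k=1)
```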

**Lemma 1.** *Given a universal co-Büchi automaton* A *and a* k*-lookahead transition system* S<sub>k</sub>*.* S<sub>k</sub> ⊨ A *if, and only if, the run graph* S<sub>k</sub> × A *is accepting.*

Finally, synthesis amounts to solving the following constraint system:

$$\begin{split} & \exists \lambda^{\mathbb{B}} : S \times Q \times \mathcal{T}^{k} \to \mathbb{B}. \exists \lambda^{\mathbb{N}} : S \times Q \times \mathcal{T}^{k} \to \mathbb{N}. \\ & \exists init \colon \mathcal{T}^{k} \to S. \exists \tau \colon S \times \mathcal{T} \to S. \exists l \colon S \to \Gamma. \\ & \left( \forall \upsilon \in \mathcal{T}^{k} . \lambda^{\mathbb{B}}(init(\upsilon), q\_{0}, \upsilon) \right) \wedge \\ & \forall \upsilon\_{1} \upsilon\_{2} \cdots \upsilon\_{k+1} \in \mathcal{T}^{k+1} . \forall s, s' \in S. \forall q, q' \in Q. \\ & \left( \lambda^{\mathbb{B}}(s, q, \upsilon\_{1} \cdots \upsilon\_{k}) \wedge s' = \tau(s, \upsilon\_{k+1}) \wedge q' \in \delta(q, \upsilon\_{1}, l(s)) \right) \\ & \Rightarrow \lambda^{\mathbb{B}}(s', q', \upsilon\_{2} \cdots \upsilon\_{k+1}) \wedge \lambda^{\mathbb{N}}(s, q, \upsilon\_{1} \cdots \upsilon\_{k}) \succeq \lambda^{\mathbb{N}}(s', q', \upsilon\_{2} \cdots \upsilon\_{k+1}) \end{split}$$

**Corollary 4.** *Given some* <sup>k</sup> <sup>≥</sup> <sup>0</sup>*, if the constraint system is satisfiable for some bound on the size of* S<sub>k</sub>*, then* S<sub>k</sub> ⊨ A*.*

#### **4 Synthesis with Quantifier Alternations**

We now build on the introduced techniques to solve the *synthesis* problem for HyperLTL with quantifier alternation, that is, we search for implementations that satisfy the given properties. In previous work [13], the synthesis problem for ∃<sup>∗</sup>∀<sup>∗</sup> HyperLTL was solved by a reduction to the distributed synthesis problem. We present an alternative synthesis procedure that (1) introduces the necessary concepts for the synthesis of the ∀<sup>∗</sup>∃<sup>∗</sup> fragment and that (2) strictly decomposes the choice of the existential trace quantifier from the implementation.

Fix a formula of the form ∃<sup>m</sup>∀<sup>n</sup>ϕ. We again reduce the verification problem to the problem of determining whether a run graph is accepting. As the existential quantifiers do not depend on the universal ones, there is no future dependency and thus no need for prophecy variables or bounded lookahead. Formally, S<sub>∃</sub> is a tuple ⟨X, x<sub>0</sub>, μ, l<sub>∃</sub>⟩ such that X is a set of states, x<sub>0</sub> ∈ X is the designated initial state, μ : X → X is the transition function, and l<sub>∃</sub> : X → Υ<sup>m</sup> is the labeling function. S<sub>∃</sub> produces infinite sequences of (Υ<sup>m</sup>)<sup>ω</sup>, without having any knowledge about the behavior of the universally quantified traces. The run graph is then (S<sup>m</sup> ∥ S<sub>∃</sub>) × S<sup>n</sup> × A<sub>ϕ</sub>. The constraint system is built analogously to Sect. 3.2, with the difference that the representation of the system S is now also uninterpreted. In the resulting SMT constraint system, we have two bounds, one for the size of the implementation S and one for the size of S<sub>∃</sub>.

**Corollary 5.** *The bounded synthesis algorithm decides the realizability problem for* ∃<sup>∗</sup>∀<sup>1</sup> *HyperLTL and is a semi-decision procedure for* ∃<sup>∗</sup>∀<sup>>1</sup> *HyperLTL.*

The synthesis problem for formulas in the ∀<sup>∗</sup>∃<sup>∗</sup> HyperLTL fragment uses the same reduction to a constraint system as the strategy synthesis in Sect. 3.2, with the only difference that the transition system S itself is uninterpreted. In the resulting SMT constraint systems, we have three bounds, the size of the implementation <sup>S</sup>, the size of the strategy <sup>S</sup>∃, and the lookahead <sup>k</sup>.

**Fig. 1.** HyperLTL model checking with MCHyper

**Corollary 6.** *Given a HyperLTL formula* <sup>∀</sup><sup>n</sup>∃<sup>m</sup><sup>ϕ</sup> *where* <sup>ϕ</sup> *is quantifier-free.* <sup>∀</sup><sup>n</sup>∃<sup>m</sup><sup>ϕ</sup> *is realizable if the SMT constraint system corresponding to the run graph* <sup>S</sup><sup>n</sup> <sup>×</sup> (S<sup>m</sup> || S∃) × A<sup>ϕ</sup> *is satisfiable for some bounds on* <sup>S</sup>*,* <sup>S</sup>∃*, and lookahead* <sup>k</sup>*.*

#### **5 Implementations and Experimental Evaluation**

We have integrated the model checking technique with a manually provided strategy into the HyperLTL hardware model checker MCHyper<sup>1</sup>. For the synthesis of strategies and reactive systems from hyperproperties, we have developed a separate bounded synthesis tool based on SMT-solving. In the following, we describe these implementations and report on experimental results. All experiments ran on a machine with dual-core Core i7, 3.3 GHz, and 16 GB memory.

**Hardware Model Checking with Given Strategies.** We have extended the model checker MCHyper [18] from the alternation-free fragment to formulas with one quantifier alternation. The input to MCHyper is a circuit description as an And-Inverter-Graph in the Aiger format and a HyperLTL formula. Figures 1a and 1b show the model checking process in MCHyper without and with quantifier alternation, respectively. For formulas with quantifier alternation, the model checker now also accepts a strategy as an additional Aiger circuit C<sub>σ</sub>. Based on this strategy, MCHyper creates a new circuit where only the inputs of the universal system copies are exposed and the inputs of the existential system

<sup>1</sup> Try the online tool interface with the latest version of MCHyper: https://www.react.uni-saarland.de/tools/online/MCHyper/.


**Table 1.** Experimental results for MCHyper on the software doping and mutual exclusion benchmarks. All experiments used the IC3 option for abc. Model and property names correspond to the ones used in [9] and [18].

copies are determined by the strategy. The new circuit is then model checked as described in [18] with abc [4].

We evaluate our extension of MCHyper on formulas with quantifier alternation based on benchmarks from software doping [9] and symmetry in mutual exclusion algorithms [18]. Both problems have previously been analyzed with MCHyper; however, since the properties in both problems require quantifier alternation, we were previously limited to a (manually obtained) approximation of the properties as universal formulas. The correctness of such manual approximations is not guaranteed and must be shown separately. By directly model checking the formula with quantifier alternation, we know that we are checking the correct formula without needing any additional proof of correctness.

*Software Doping.* D'Argenio et al. [9] examined a clean and a doped version of an emission control program of a car and used the previous version of MCHyper to formally verify approximations of these properties. Robust cleanness is expressed in the one-alternation fragment using two <sup>∀</sup><sup>2</sup>∃<sup>1</sup> HyperLTL formulas (given in Prop. 19 in [9], cf. Sect. 1). In [9], the formulas were strengthened into alternation-free formulas that imply the original properties. Despite the quantifier alternation, Table <sup>1</sup> shows that the new version of MCHyper verifies the precise formulas in roughly the same time as the alternation-free approximations [9] while giving stronger correctness guarantees.

*Symmetry in Mutual Exclusion Protocols.* ∀<sup>∗</sup>∃<sup>∗</sup> HyperLTL allows us to specify symmetry for mutual exclusion protocols. In such protocols, we wish to guarantee that every request is eventually answered, and the grants are mutually exclusive. In our experiments, we used an implementation of the Bakery protocol [25]. Table <sup>1</sup> shows the verification results for the precise <sup>∀</sup><sup>1</sup>∃<sup>1</sup> properties. Comparing these results to the performance on the approximations of the symmetry properties [18], we, again, observe that the verification times are similar. However, we gain the additional correctness guarantees as described above.

**Strategy and System Synthesis.** For the synthesis of strategies for existential quantifiers and for the synthesis of reactive systems from hyperproperties, we have developed a separate bounded synthesis tool based on SMT solving with z3 [29]. Our evaluation is based on two benchmark families, the *dining cryptographers* problem [5] and a simplified version of the symmetry problem in mutual exclusion protocols discussed previously. The results are shown in Table 2. Obviously, synthesis operates at a vastly smaller scale than model checking with given strategies. In the dining cryptographers example, z3 was unable to find an implementation for the full synthesis problem, but could easily synthesize strategies for the existential trace quantifiers when provided with an implementation. With the progress of constraint solvers that employ quantification over Boolean functions [31], we expect the scalability of our synthesis approach to improve.


**Table 2.** Summary of the experimental results on the benchmark sets described in Sect. 5. When no hyperproperty is given, only the LTL part is used.

#### **6 Conclusions**

We have presented model checking and synthesis techniques for hyperliveness properties expressed as HyperLTL formulas with quantifier alternation. The alternation makes it possible to specify hyperproperties such as generalized noninterference, symmetry, and deniability. Our approach is the first method for the synthesis of reactive systems from HyperLTL formulas with quantifier alternation and the first practical method for the verification of such specifications.

The approach is based on a game-theoretic view of existential quantifiers, where the ∃-player reacts to decisions of the ∀-player. The key advantage is that the complementation of the system automaton is avoided (cf. [18]). Instead, a strategy must be found for the ∃-player. Since this can be done either manually or through automatic synthesis, the user of the model checking or synthesis tool has the opportunity to trade some automation for a significant gain in performance.

**Acknowledgements.** We would like to thank Sebastian Biewer for providing the software doping models and formulas, Marvin Stenger for his advice on our synthesis experiments, and Jana Hofmann for her helpful comments on a draft of this paper.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Quantitative Mitigation of Timing Side Channels**

Saeid Tizpaz-Niari(B), Pavol Černý, and Ashutosh Trivedi

University of Colorado Boulder, Boulder, USA Saeid.TizpazNiari@colorado.edu

**Abstract.** Timing side channels pose a significant threat to the security and privacy of software applications. We propose an approach for *mitigating* this problem by decreasing the strength of the side channels as measured by entropy-based objectives, such as min-guess entropy. Our goal is to minimize the information leaks while guaranteeing a user-specified maximal acceptable performance overhead. We dub the decision version of this problem *Shannon mitigation*, and consider two variants, *deterministic* and *stochastic*. First, we show that the deterministic variant is NP-hard. However, we give a polynomial algorithm that finds an optimal solution from a restricted set. Second, for the stochastic variant, we develop an approach that uses optimization techniques specific to the entropy-based objective used. For instance, for min-guess entropy, we use mixed integer-linear programming. We apply the algorithm to a threat model where the attacker gets to make *functional observations*, that is, where she observes the running time of the program for the same secret value combined with different public input values. Existing mitigation approaches do not give confidentiality or performance guarantees for this threat model. We evaluate our tool Schmit on a number of micro-benchmarks and real-world applications with different entropy-based objectives. In contrast to the existing mitigation approaches, we show that in the functional-observation threat model, Schmit is scalable and able to maximize confidentiality under the performance overhead bound.

#### **1 Introduction**

Information leaks through timing side channels remain a challenging problem [13,16,24,29,35,37,47]. A program leaks secret information through timing side channels if an attacker can deduce secret values (or their properties) by observing response times. We consider the problem of mitigating timing side channels. Unlike elimination techniques [7,31,46] that aim to completely remove timing leaks without considering the performance penalty, the goal of mitigation techniques [10,26,48] is to weaken the leaks, while keeping the penalty low.

We define the *Shannon mitigation* problem that decides whether there is a mitigation policy to achieve a lower bound on a given security entropy-based measure while respecting an upper bound on the performance overhead. Consider an example where the program-under-analysis has a secret variable with seven possible values, and has three different timing behaviors, each forming a cluster of secret values. It takes 1 second if the secret value is 1, it takes 5 seconds if the secret is between 2 and 5, and it takes 10 seconds if the secret value is 6 or 7. The *entropy-based measure* quantifies the remaining uncertainty about the secret after timing observations. Min-guess entropy [11,25,41] for this program is 1, because if the observed execution time is 1, the attacker guesses the secret in one try. A *mitigation policy* involves merging some timing clusters by introducing delays. A good solution might be to introduce a 9-second delay if the secret is 1, which merges two timing clusters. But, this might be disallowed by the budget on the performance overhead. Therefore, another solution must be found, such as introducing a 4-second delay when the secret value is 1.
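The arithmetic behind this example is easy to reproduce. A minimal sketch, using the cluster sizes of the example (sizes 1, 4, and 2 for the 1-, 5-, and 10-second clusters) and the min-guess entropy formula from Sect. 3:

```python
def min_guess_entropy(cluster_sizes):
    """Min-guess entropy of observation classes with the given sizes,
    assuming uniformly distributed secrets: min over i of (B_i + 1) / 2."""
    return min((b + 1) / 2 for b in cluster_sizes)

# Clusters of the example: {1} -> 1s, {2,3,4,5} -> 5s, {6,7} -> 10s.
base = min_guess_entropy([1, 4, 2])  # 1.0: secret 1 is guessed in one try
nine = min_guess_entropy([4, 3])     # 9s delay merges {1} into the 10s cluster
four = min_guess_entropy([5, 2])     # 4s delay merges {1} into the 5s cluster
```

Merging the singleton into the 10-second cluster (the 9-second delay) raises min-guess entropy to 2.0, while the cheaper 4-second delay into the 5-second cluster yields 1.5.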

We develop two variants of the Shannon mitigation problem: *deterministic* and *stochastic*. The mitigation policy of the deterministic variant requires us to move all secret values associated to an observation to another observation, while the policy of the stochastic variant allows us to move only a portion of secret values in an observation to another one. We show that the deterministic variant of the Shannon mitigation problem is intractable and propose a dynamic programming algorithm to approximate the optimal solution for the problem by searching through a restricted set of solutions. We develop an algorithm that reduces the problem in the stochastic variant to a well-known optimization problem that depends on the entropy-based measure. For instance, with min-guess entropy, the optimization problem is mixed integer-linear programming.

We consider a threat model where an attacker knows the public inputs (known-message attacks [26]), and furthermore, where the public input changes much more often than the secret inputs (for instance, secrets such as bank account numbers do not change often). As a result, for each secret, the attacker observes a timing function of the public inputs. We call this model *functional observations* of timing side channels.

We develop our tool Schmit that has three components: side channel discovery [45], search for the mitigation policy, and the policy enforcement. The side channel discovery builds the functional observations [45] and measures the entropy of secret set after the observations. The mitigation policy component includes the implementation of the dynamic programming and optimization algorithms. The enforcement component is a monitoring system that uses the program internals and functional observations to enforce the policy at runtime. To summarize, we make the following contributions:


**Fig. 1.** (a) The example used in Sect. 2. (b) The timing functions for each secret value of the program.

#### **2 Overview**

First, we describe the threat model considered in this paper. Second, we describe our approach on a running example. Third, we compare the results of Schmit with the existing mitigation techniques [10,26,48] and show that Schmit achieves the highest entropy (i.e., best mitigation) for all three entropy objectives.

**Threat Model.** We assume that the attacker has access to the source code and the mitigation model, and she can sample the run-time of the application arbitrarily many times on her own machine. During an attack, she intends to guess a fixed secret of the target machine by observing the mitigated running time. Since we consider the attack models where the attacker knows the public inputs and the secret inputs are less volatile than public inputs, her observations are functional observations, where for each secret value, she learns a function from the public inputs to the running time.

**Example 2.1.** Consider the program shown in Fig. 1(a). It takes secret and public values as inputs. The running time depends on the number of set bits in both secret and public inputs. We assume that secret and public inputs can be between 1 and 1023. Figure 1(b) shows the running time of different secret values as timing functions, i.e., functions from the public inputs to the running time.

**Side channel discovery.** One can use existing tools to find the initial functional observations [44,45]. In Example 2.1, the functional observations are F = {y, 2y, ..., 10y}, where y is a variable whose value is the number of set bits in the public input. The corresponding secret classes after this observation are S<sub>F</sub> = {1<sub>1</sub>, 1<sub>2</sub>, 1<sub>3</sub>, ..., 1<sub>10</sub>}, where 1<sub>n</sub> denotes the set of secret values that have n set bits. The sizes of the classes are B = {10, 45, 120, 210, 252, 210, 120, 45, 10, 1}. We use the L1 norm as the metric to calculate the distance between the functional observations F. This distance (penalty) matrix specifies the extra performance overhead of moving from one functional observation to another. Under the assumption of a uniform distribution over the secret input, the Shannon entropy, guessing entropy, and min-guess entropy are 7.3, 90.1, and 1.0, respectively. These entropies are defined in Sect. 3 and measure the remaining entropy of the secret set after the observations. We aim to maximize the entropy measures, while keeping the performance overhead below a threshold, say 60% for this example.

**Mitigation with Schmit.** We use our tool Schmit to mitigate timing leaks of Example 2.1. The mitigation policy for the Shannon entropy objective is shown in Fig. 2(a). The policy results in two classes of observations. The policy requires moving the functional observations {y, 2y, ..., 5y} to 6y and all other observations {7y, 8y, 9y} to 10y. To enforce this policy, we use a monitoring system at runtime. The monitoring system uses a decision tree model of the initial functional observations. The decision tree model characterizes each functional observation with associated program internals such as method calls or basic block invocations [43,44]. The decision tree model for Example 2.1 is shown in Fig. 2(b). The monitoring system records program internals and matches them with the decision tree model to detect the current functional observation. Then, it adds delays, if necessary, to the execution time in order to enforce the mitigation policy. With this method, the mitigated functional observation set is G = {6y, 10y} and the secret

**Fig. 2.** (a) Mitigation policy calculation with the deterministic algorithm (left). The observations x1 and x2 stand for all observations from C2−C5 and from C8−C9, resp.; (b) Learned discriminant decision tree (center): it characterizes the functional clusters of Fig. 1(b) with internals of the program in Fig. 1(a); and (c) observations (right) after the mitigation by Schmit results in two classes of observations.

class is S<sub>G</sub> = {{1<sub>1</sub>, 1<sub>2</sub>, 1<sub>3</sub>, 1<sub>4</sub>, 1<sub>5</sub>, 1<sub>6</sub>}, {1<sub>7</sub>, 1<sub>8</sub>, 1<sub>9</sub>, 1<sub>10</sub>}} as shown in Fig. 2(c). The performance overhead of this mitigation is 43.1%. The Shannon, guessing, and min-guess entropies have improved to 9.7, 459.6, and 193.5, respectively.

**Comparison with state of the art.** We compare our mitigation results to the black-box mitigation scheme [10] and bucketing [26]. *Black-box double scheme technique.* We use the double scheme technique [10] to mitigate the leaks of Example 2.1. This mitigation uses a prediction model to release events at scheduled times. Consider the prediction for releasing event i at the N-th epoch, S(N, i) = max(inp<sub>i</sub>, S(N, i−1)) + p(N), where inp<sub>i</sub> is the arrival time of the i-th request, S(N, i−1) is the prediction for request i−1, and p(N) = 2<sup>N−1</sup> models the basis of the prediction scheme at the N-th epoch. We assume that the requests are of the same type and that the sequence of public input requests for each secret is received at the beginning of epoch N = 1. Figure 3(a) shows the functional observations after applying the predictive mitigation. With this mitigation, the classes of observations are S<sub>G</sub> = {1<sub>1</sub>, {1<sub>2</sub>, 1<sub>3</sub>}, {1<sub>4</sub>, 1<sub>5</sub>, 1<sub>6</sub>, 1<sub>7</sub>}, {1<sub>8</sub>, 1<sub>9</sub>, 1<sub>10</sub>}}. The number of classes of observations is reduced from 10 to 4. The performance overhead is 39.9%. The Shannon, guessing, and min-guess entropies have increased to 9.00, 321.5, and 5.5, respectively. *Bucketing.* We consider the mitigation approach with buckets [26]. For Example 2.1, if the attacker does not know the public input (unknown-message attacks [26]), the observations are {1.1, 2.1, 3.3, ..., 9.9, 10.9, ..., 109.5} as shown in Fig. 3(b). We apply the bucketing algorithm of [26] to these observations, and it finds two buckets {37.5, 109.5}, shown with the red lines in Fig. 3(b). The bucketing mitigation requires moving each observation to the closest bucket. Without functional observations, there are 2 classes of observations. However, with functional observations, there are more than 2 observations. Figure 3(c) shows how the pattern of observations leaks through functional side channels. There are 7 classes of observations: S<sub>G</sub> = {{1<sub>1</sub>, 1<sub>2</sub>, 1<sub>3</sub>}, {1<sub>4</sub>}, {1<sub>5</sub>}, {1<sub>6</sub>}, {1<sub>7</sub>}, {1<sub>8</sub>}, {1<sub>9</sub>}, {1<sub>10</sub>}}. The Shannon, guessing, and min-guess entropies are 7.63, 102.3, and 1.0, respectively.

**Fig. 3.** (a) The execution time after mitigation using the double scheme technique [10]. There are four classes of functional observations after the mitigation. (b) Mitigation with bucketing [26]. All observations must move to the closest red line. (c) Functional observations distinguish 7 classes of observations after mitigating with bucketing.
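The predictive release schedule of the double scheme is easy to state in code. A minimal sketch for a single epoch (N = 1, so p(N) = 2<sup>N−1</sup> = 1), with illustrative arrival times and S(N, 0) taken as 0, both assumptions made only for this sketch:

```python
def schedule(arrivals, epoch=1):
    """Release times under the double scheme predictor within one epoch.

    S(N, i) = max(inp_i, S(N, i-1)) + p(N) with p(N) = 2**(N-1);
    S(N, 0) is taken as 0 here (an assumption for the sketch).
    """
    p = 2 ** (epoch - 1)
    s_prev = 0.0
    out = []
    for inp in arrivals:
        s_prev = max(inp, s_prev) + p  # never release before the prediction
        out.append(s_prev)
    return out

# Three requests arriving at 0.5s, 1.2s and 2.8s are released at the
# predicted times 1.5s, 2.5s and 3.8s.
times = schedule([0.5, 1.2, 2.8])
```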

Overall, Schmit achieves higher entropy measures than both alternatives for all three objectives under the performance overhead bound of 60%.

#### **3 Preliminaries**

For a finite set Q, we use |Q| for its cardinality. A *discrete probability distribution*, or just distribution, over a set Q is a function d : Q → [0, 1] such that Σ<sub>q∈Q</sub> d(q) = 1. Let D(Q) denote the set of all discrete distributions over Q. We say a distribution d ∈ D(Q) is a *point distribution* if d(q) = 1 for a q ∈ Q. Similarly, a distribution d ∈ D(Q) is *uniform* if d(q) = 1/|Q| for all q ∈ Q.

**Definition 1 (Timing Model).** *The timing model of a program* P *is a tuple* [[P]] = (X, Y, <sup>S</sup>, δ) *where* <sup>X</sup> <sup>=</sup> {x1,...,xn} *is the set of secret-input variables,* <sup>Y</sup> <sup>=</sup> {y1,...,ym} *is the set of public-input variables,* S ⊆ <sup>R</sup><sup>n</sup> *is a finite set of secret-inputs, and* <sup>δ</sup> : <sup>R</sup><sup>n</sup> <sup>×</sup> <sup>R</sup><sup>m</sup> <sup>→</sup> <sup>R</sup>≥<sup>0</sup> *is the execution-time function of the program over the secret and public inputs.*

We assume that the adversary knows the program and wishes to learn the value of the secret input. To do so, for some fixed secret value s ∈ S, the adversary can invoke the program to estimate (to an arbitrary precision) the execution time of the program. If the set of public inputs is empty, i.e., m = 0, the adversary can only make *scalar observations* of the execution time corresponding to a secret value. In the more general setting, however, the adversary can arrange her observations in a functional form by estimating an approximation of the *timing function* δ(s) : R<sup>m</sup> → R<sub>≥0</sub> of the program.

A *functional observation* of the program P for a secret input s ∈ S is the function <sup>δ</sup>(s) : <sup>R</sup><sup>m</sup> <sup>→</sup> <sup>R</sup>≥<sup>0</sup> defined as **<sup>y</sup>** <sup>∈</sup> <sup>R</sup><sup>m</sup> → <sup>δ</sup>(s, **<sup>y</sup>**). Let F ⊆ [R<sup>m</sup> <sup>→</sup> <sup>R</sup>≥<sup>0</sup>] be the finite set of all functional observations of the program P. We define an order ≺ over the functional observations F: for f,g ∈ F we say that f ≺ g if <sup>f</sup>(y) <sup>≤</sup> <sup>g</sup>(y) for all <sup>y</sup> <sup>∈</sup> <sup>R</sup>m.

The set F characterizes an equivalence relation ≡<sub>F</sub>, namely secrets with equivalent functional observations, over the set S, defined as follows: s ≡<sub>F</sub> s′ if there is an f ∈ F such that δ(s) = δ(s′) = f. Let S<sub>F</sub> = {S<sub>1</sub>, S<sub>2</sub>, ..., S<sub>k</sub>} be the quotient space of S characterized by the observations F = {f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>k</sub>}. We write S<sub>f</sub> for the secret set S ∈ S<sub>F</sub> corresponding to the observation f ∈ F. Let B = {B<sub>1</sub>, B<sub>2</sub>, ..., B<sub>k</sub>} be the sizes of the observational equivalence classes in S<sub>F</sub>, i.e., B<sub>i</sub> = |S<sub>f<sub>i</sub></sub>| for f<sub>i</sub> ∈ F, and let B = |S| = Σ<sub>1≤i≤k</sub> B<sub>i</sub>.
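For Example 2.1 from Sect. 2, where δ(s, y) depends on the secret only through its number of set bits, the quotient S<sub>F</sub> and the class sizes B can be computed by direct enumeration; a sketch (the popcount feature is specific to that example, not part of the general definitions):

```python
from collections import Counter

def observation_classes(secrets, timing_feature):
    """Group secrets by their functional observation.

    timing_feature(s) is assumed to identify the timing function
    delta(s, .) up to equality: two secrets share an observation
    iff the feature coincides.  Returns observation id -> class size B_i.
    """
    return Counter(timing_feature(s) for s in secrets)

# Example 2.1: delta(s, y) = popcount(s) * y for secrets in 1..1023,
# so the observation of s is determined by popcount(s).
sizes = observation_classes(range(1, 1024), lambda s: bin(s).count("1"))
```

The resulting class sizes are the binomial coefficients C(10, i), matching B = {10, 45, 120, 210, 252, 210, 120, 45, 10, 1} from Sect. 2.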

Shannon entropy, guessing entropy, and min-guess entropy are three prevalent information metrics to quantify information leaks in programs. Köpf and Basin [25] characterize expressions for various information-theoretic measures of information leaks when there is a uniform distribution on S, given below.

**Proposition 1 (Köpf and Basin [25]).** *Let* F = {f<sub>1</sub>, ..., f<sub>k</sub>} *be a set of observations and let* S *be the set of secret values. Let* B = {B<sub>1</sub>, ..., B<sub>k</sub>} *be the corresponding sizes of the secret sets in each class of observation and* B = Σ<sub>1≤i≤k</sub> B<sub>i</sub>*. Assuming a uniform distribution on* S*, the entropies can be characterized as:*

*1.* **Shannon Entropy:** $SE(\mathcal{S}|\mathcal{F}) \overset{\text{def}}{=} \left(\frac{1}{B}\right) \sum\_{1 \le i \le k} B\_i \log\_2(B\_i)$*,*

*2.* **Guessing Entropy:** $GE(\mathcal{S}|\mathcal{F}) \overset{\text{def}}{=} \left(\frac{1}{2B}\right) \sum\_{1 \le i \le k} B\_i^2 + \frac{1}{2}$*, and*

*3.* **Min-Guess Entropy:** $mGE(\mathcal{S}|\mathcal{F}) \overset{\text{def}}{=} \min\_{1 \le i \le k} \{(B\_i + 1)/2\}$*.*
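Proposition 1 reduces entropy computation to simple arithmetic over the class sizes. A sketch, evaluated on the class sizes B of Example 2.1 (Shannon entropy ≈ 7.3 and min-guess entropy 1.0, as reported in Sect. 2):

```python
import math

def shannon_entropy(sizes):
    # SE = (1/B) * sum of B_i * log2(B_i)
    B = sum(sizes)
    return sum(b * math.log2(b) for b in sizes) / B

def guessing_entropy(sizes):
    # GE = (1/(2B)) * sum of B_i^2 + 1/2
    B = sum(sizes)
    return sum(b * b for b in sizes) / (2 * B) + 0.5

def min_guess_entropy(sizes):
    # mGE = min over i of (B_i + 1) / 2
    return min((b + 1) / 2 for b in sizes)

# Class sizes of Example 2.1 (number of 10-bit values with i set bits):
B = [10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
se, ge, mge = shannon_entropy(B), guessing_entropy(B), min_guess_entropy(B)
```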

#### **4 Shannon Mitigation Problem**

Our goal is to mitigate the information leakage due to the timing side channels by adding synthetic delays to the program. An aggressive, but commonly-used, mitigation strategy aims to eliminate the side channels by adding delays such that every secret value yields a common functional observation. However, this strategy may often be impractical as it may result in unacceptable performance degradations of the response time. Assuming a well-known penalty function associated with the performance degradation, we study the problem of maximizing entropy while respecting a bound on the performance degradation. We dub the decision version of this problem Shannon mitigation.

Adding synthetic delays to the execution time of the program, so as to mask the side channel, can give rise to new functional observations that correspond to upper envelopes of various combinations of the original observations. Let F = {f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>k</sub>} be the set of functional observations. For I ⊆ {1, 2, ..., k}, let f<sub>I</sub> = **y** ∈ R<sup>m</sup> ↦ sup<sub>i∈I</sub> f<sub>i</sub>(**y**) be the functional observation corresponding to the upper envelope of the functional observations in the set I. Let G(F) = {f<sub>I</sub> : ∅ ≠ I ⊆ {1, 2, ..., k}} be the set of all possible functional observations resulting from the upper-envelope calculations. To change the observation of a secret value with functional observation f<sub>i</sub> to a new observation f<sub>I</sub> (we assume that i ∈ I), we need to add the delay function f<sub>i</sub><sup>I</sup> : **y** ∈ R<sup>m</sup> ↦ f<sub>I</sub>(**y**) − f<sub>i</sub>(**y**).
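On a finite grid of public-input values, the upper-envelope and delay functions are just pointwise maxima and differences; a sketch using the sampled observations y and 2y from Example 2.1 (the sampling grid y = 1..4 is an illustrative choice):

```python
def upper_envelope(fs):
    """Pointwise supremum f_I of sampled observations: each observation
    is a list of running times over the same public-input grid."""
    return [max(vals) for vals in zip(*fs)]

def delay(f, f_I):
    """Delay function f_i^I = f_I - f_i (nonnegative whenever f_i <= f_I
    pointwise, i.e. the policy respects the order)."""
    return [hi - lo for lo, hi in zip(f, f_I)]

f1 = [1, 2, 3, 4]               # observation y, sampled at y = 1..4
f2 = [2, 4, 6, 8]               # observation 2y
f_I = upper_envelope([f1, f2])  # envelope of {y, 2y}, i.e. 2y
pad = delay(f1, f_I)            # delay needed to lift f1's secrets to f_I
```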

*Mitigation Policies.* Let $\mathcal{G} \subseteq \mathcal{G}(\mathcal{F})$ be a set of admissible post-mitigation observations. A *mitigation policy* is a function $\mu : \mathcal{F} \to \mathcal{D}(\mathcal{G})$ that, for each secret $s \in \mathcal{S}_f$, suggests the probability distribution $\mu(f)$ over the functional observations. We say that a mitigation policy is *deterministic* if for all $f \in \mathcal{F}$ the distribution $\mu(f)$ is a point distribution. Abusing notation, we represent a deterministic mitigation policy as a function $\mu : \mathcal{F} \to \mathcal{G}$. The semantics of a mitigation policy recommends to a program analyst a probability $\mu(f)(g)$ of elevating a secret input $s \in \mathcal{S}_f$ from the observational class $f$ to the class $g \in \mathcal{G}$ by adding $\max\{0, g(p) - f(p)\}$ units of delay to the corresponding execution time $\delta(s, p)$ for all $p \in Y$. We assume that mitigation policies respect the order, i.e., for every mitigation policy $\mu$ and all $f \in \mathcal{F}$ and $g \in \mathcal{G}$, $\mu(f)(g) > 0$ implies $f \prec g$. Let $\mathcal{M}(\mathcal{F} \to \mathcal{G})$ be the set of mitigation policies from the set of observational clusters $\mathcal{F}$ into the clusters $\mathcal{G}$.

For the functional observations $\mathcal{F} = \{f_1, \ldots, f_k\}$ and a mitigation policy $\mu \in \mathcal{M}(\mathcal{F} \to \mathcal{G})$, the resulting observation set $\mathcal{F}[\mu] \subseteq \mathcal{G}$ is defined as:

$$\mathcal{F}[\mu] = \{ g \in \mathcal{G} \;:\; \text{there exists } f \in \mathcal{F} \text{ such that } \mu(f)(g) > 0 \}\;.$$

Since the mitigation policy is stochastic, we use the average sizes of the resulting observations to measure the fitness of a mitigation policy. For $\mathcal{F}[\mu] = \{g_1, g_2, \ldots, g_\ell\}$, we define the expected class sizes $\mathcal{B}_\mu = \langle C_1, C_2, \ldots, C_\ell \rangle$ as $C_i = \sum_{j=1}^{i} \mu(f_j)(f_i) \cdot B_j$ (observe that $\sum_{i=1}^{\ell} C_i = B$). Assuming a uniform distribution on $\mathcal{S}$, various entropies for the expected class sizes after applying a policy $\mu \in \mathcal{M}(\mathcal{F} \to \mathcal{G})$ can be characterized by the following expressions:

$$SE(\mathcal{S}|\mathcal{F}, \mu) = \sum_{1 \leq i \leq \ell} \frac{C_i}{B} \log_2\Big(\frac{B}{C_i}\Big), \quad GE(\mathcal{S}|\mathcal{F}, \mu) = \frac{1}{2B}\sum_{1 \leq i \leq \ell} C_i^2 + \frac{1}{2}, \quad mGE(\mathcal{S}|\mathcal{F}, \mu) = \min_{1 \leq i \leq \ell}\{(C_i + 1)/2\}.$$
We note that the above definitions do not represent the expected entropies, but rather the entropies of the expected cluster sizes. However, the three quantities provide bounds on the expected entropies after applying $\mu$. Since Shannon and min-guess entropies are concave functions, Jensen's inequality implies that $SE(\mathcal{S}|\mathcal{F}, \mu)$ and $mGE(\mathcal{S}|\mathcal{F}, \mu)$ are upper bounds on the expected Shannon and min-guess entropies. Similarly, $GE(\mathcal{S}|\mathcal{F}, \mu)$, being a convex function, gives a lower bound on the expected guessing entropy.
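To make the role of the expected class sizes concrete, here is a small illustration (encoding and names assumed: `mu[j][i]` stands for $\mu(f_j)(f_i)$, with mass only on $i \geq j$ since policies respect the order):

```python
def expected_class_sizes(B, mu):
    """C_i = sum_{j <= i} mu(f_j)(f_i) * B_j, for class sizes B_1..B_k."""
    return [sum(mu[j][i] * B[j] for j in range(i + 1)) for i in range(len(B))]

# Elevating the first class (size 2) into the second with probability 1:
C = expected_class_sizes([2, 4, 2], [[0, 1, 0], [0, 1, 0], [0, 0, 1]])
assert C == [0, 6, 2] and sum(C) == 8   # the C_i re-total to B
```

Feeding these $C_i$ into the entropy expressions above gives the quantities the optimization in Sect. 5.2 maximizes.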

We are interested in maximizing the entropy while respecting constraints on the overall performance of the system. We formalize the notion of performance by introducing performance penalties: a function $\pi : \mathcal{F} \times \mathcal{G} \to \mathbb{R}_{\geq 0}$ such that elevating from the observation $f \in \mathcal{F}$ to the functional observation $g \in \mathcal{G}$ adds an extra $\pi(f, g)$ performance overhead to the program. The expected performance penalty associated with a policy $\mu$ is the probabilistically weighted sum of the penalties, i.e., $\pi(\mu) = \sum_{f \in \mathcal{F}, g \in \mathcal{G} : f \prec g} |\mathcal{S}_f| \cdot \mu(f)(g) \cdot \pi(f, g)$. Now, we introduce our key decision problem.

**Definition 2 (Shannon Mitigation).** *Given a set of functional observations* $\mathcal{F} = \{f_1, \ldots, f_k\}$*, a set of admissible post-mitigation observations* $\mathcal{G} \subseteq \mathcal{G}(\mathcal{F})$*, a set of secrets* $\mathcal{S}$*, a penalty function* $\pi : \mathcal{F} \times \mathcal{G} \to \mathbb{R}_{\geq 0}$*, a performance penalty upper bound* $\Delta \in \mathbb{R}_{\geq 0}$*, and an entropy lower bound* $E \in \mathbb{R}_{\geq 0}$*, the Shannon mitigation problem* $\mathrm{Shan}_{\mathbb{E}}(\mathcal{F}, \mathcal{G}, \mathcal{S}, \pi, E, \Delta)$*, for a given entropy measure* $\mathbb{E} \in \{SE, GE, mGE\}$*, is to decide whether there exists a mitigation policy* $\mu \in \mathcal{M}(\mathcal{F} \to \mathcal{G})$ *such that* $\mathbb{E}(\mathcal{S}|\mathcal{F}, \mu) \geq E$ *and* $\pi(\mu) \leq \Delta$*. We define the deterministic Shannon mitigation variant, where the goal is to find such a deterministic policy.*

#### **5 Algorithms for Shannon Mitigation Problem**

#### **5.1 Deterministic Shannon Mitigation**

We first establish the intractability of the deterministic variant.

**Theorem 1.** *The deterministic Shannon mitigation problem is NP-complete.*

*Proof.* It is easy to see that the deterministic Shannon mitigation problem is in NP: one can guess a certificate as a deterministic mitigation policy $\mu \in \mathcal{M}(\mathcal{F} \to \mathcal{G})$ and verify in polynomial time that it satisfies the entropy and overhead constraints. Next, we sketch the hardness proof for the min-guess entropy measure by providing a reduction from the *two-way partitioning* problem [28]. For the Shannon entropy and guessing entropy measures, reductions can be established from the Shannon capacity problem [18] and the Euclidean sum-of-squares clustering problem [8], respectively.

Given a set $A = \{a_1, a_2, \ldots, a_k\}$ of integer values, the two-way partitioning problem is to decide whether there is a partition $A_1 \uplus A_2 = A$ into two sets $A_1$ and $A_2$ with equal sums, i.e., $\sum_{a \in A_1} a = \sum_{a \in A_2} a$. W.l.o.g. assume that $a_i \leq a_j$ for $i \leq j$. We reduce this problem to a deterministic Shannon mitigation problem $\mathrm{Shan}_{mGE}(\mathcal{F}_A, \mathcal{G}_A, \mathcal{S}_A, \pi_A, E_A, \Delta_A)$ with $k$ clusters $\mathcal{F}_A = \mathcal{G}_A = \{f_1, f_2, \ldots, f_k\}$ and the secret set $\mathcal{S}_A = \langle S_1, S_2, \ldots, S_k \rangle$ such that $|S_i| = a_i$. If $\sum_{1 \leq i \leq k} a_i$ is odd, then the answer to the two-way partitioning instance is trivially no. Otherwise, let $E_A = (1/2)\sum_{1 \leq i \leq k} a_i$. Notice that any deterministic mitigation strategy that achieves min-guess entropy larger than or equal to $E_A$ must have at most two clusters. On the other hand, the best min-guess entropy value can be achieved by having just a single cluster. To avoid this, and to force two clusters corresponding to the two partitions of a solution to the two-way partitioning instance $A$, we introduce performance penalties such that merging more than $k - 2$ clusters is disallowed, by setting the performance penalty $\pi_A(f, g) = 1$ and the performance overhead $\Delta_A = k - 2$. It is straightforward to verify that the resulting min-guess entropy instance has a yes answer if and only if the two-way partitioning instance does.

Since the deterministic Shannon mitigation problem is intractable, we design an approximate solution. Note that the problem is hard even if we only use the existing functional observations for mitigation, i.e., $\mathcal{G} = \mathcal{F}$; therefore, we consider this case for the approximate solution. Furthermore, we assume the following *sequential dominance* restriction on a deterministic policy $\mu$: for $f, g \in \mathcal{F}$, if $f \prec g$ then either $\mu(f) \prec g$ or $\mu(f) = \mu(g)$. In other words, for any given $f \prec g$, $f$ cannot be moved to a higher cluster than $g$ without $g$ also being moved to that cluster. For example, Fig. 4(a) shows a Shannon mitigation problem with four functional observations and all possible mitigation policies (we write $\mu(i, j)$ for $\mu(f_i)(f_j)$). The policy in Fig. 4(b) satisfies the sequential dominance restriction, while the one in Fig. 4(c) does not.

The search for deterministic policies satisfying the sequential dominance restriction can be performed efficiently with dynamic programming, by memoizing intermediate results.

Algorithm 1 provides pseudocode for the dynamic programming solution that finds a deterministic mitigation policy satisfying sequential dominance. The key idea is to start with policies that produce a single cluster for the sub-problems $P_i$ over the observations $f_1, \ldots, f_i$, and then, in each step, compute policies producing one additional cluster by reusing the previously computed sub-problems while keeping track of the performance penalties. The algorithm terminates as soon as the solution of the current step respects the performance bound. The complexity of the algorithm is $O(k^3)$.
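For intuition, the same objective can be explored with a short brute-force sketch (ours, not the paper's memoized table): it enumerates contiguous merges of the classes, scores each partition by its min-guess entropy, and discards partitions whose weighted penalty exceeds the budget. All names are assumptions.

```python
def best_min_guess(B, pi, Delta):
    """Best min-guess entropy over contiguous partitions of classes with
    sizes B, under penalty bound Delta; pi[q][i] is the penalty of
    elevating class q into class i (0-based, weighted by B_q / B)."""
    k, total = len(B), sum(B)
    entropy = lambda lo, hi: (sum(B[lo:hi]) + 1) / 2
    cost = lambda lo, hi: sum(pi[q][hi - 1] * B[q] / total
                              for q in range(lo, hi - 1))
    best = -1.0
    def extend(i, ent, spent):          # classes 0..i-1 already partitioned
        nonlocal best
        if spent > Delta:
            return
        if i == k:
            best = max(best, ent)
            return
        for j in range(i + 1, k + 1):   # merge classes i..j-1 into one block
            extend(j, min(ent, entropy(i, j)), spent + cost(i, j))
    extend(0, float('inf'), 0.0)
    return best
```

This enumeration is exponential in $k$, whereas the memoized table of Algorithm 1 stays within $O(k^3)$; the sketch only makes the search space explicit.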

**Fig. 4.** (a) Example of a Shannon mitigation problem with all possible mitigation policies for 4 classes of observations. (b, c) Two examples of mitigation policies that result in 2 and 3 classes of observations, respectively.

#### **5.2 Stochastic Shannon Mitigation Algorithm**

Next, we solve the (stochastic) Shannon mitigation problem by posing it as an optimization problem. Consider the stochastic Shannon mitigation problem $\mathrm{Shan}_{\mathbb{E}}(\mathcal{F}, \mathcal{G} = \mathcal{F}, \mathcal{S}_{\mathcal{F}}, \pi, E, \Delta)$ with a stochastic policy $\mu : \mathcal{F} \to \mathcal{D}(\mathcal{G})$ and

**Algorithm 1.** Approximate Deterministic Shannon Mitigation
**Input**: The Shannon mitigation problem $\mathrm{Shan}_{mGE}(\mathcal{F}, \mathcal{G} = \mathcal{F}, \mathcal{S}_{\mathcal{F}}, \pi, E, \Delta)$
**Output**: The entropy table $T$.
**1** **for** $i = 1$ **to** $k$ **do**
**2** &emsp; $T(i, 1) = \mathbb{E}\big(\bigcup_{j=1}^{i} S_j\big)$
**3** &emsp; **if** $\sum_{1 \leq j \leq i} \pi(j, i)(B_j/B) \leq \Delta$ **then** $\Pi(i, 1) = \sum_{1 \leq j \leq i} \pi(j, i)(B_j/B)$
**4** &emsp; **else** $\Pi(i, 1) = \infty$
**5** **if** $\Pi(k, 1) < \infty$ **then return** $T$
**6** **for** $r = 2$ **to** $k$ **do**
**7** &emsp; **for** $i = 1$ **to** $k$ **do**
**8** &emsp;&emsp; $\Omega(i, r) = \{j : 1 \leq j < i \text{ and } \Pi(j, r-1) + \sum_{j < q \leq i} \pi(q, i)(B_q/B) \leq \Delta\}$
**9** &emsp;&emsp; **if** $\Omega(i, r) \neq \emptyset$ **then** $T(i, r) = \max_{j \in \Omega(i, r)} \min\big(T(j, r-1), \mathbb{E}(\bigcup_{q=j+1}^{i} S_q)\big)$
**10** &emsp;&emsp; **else** $T(i, r) = -\infty$
**11** &emsp;&emsp; Let $j$ be the index that maximizes $T(i, r)$
**12** &emsp;&emsp; **if** $\Omega(i, r) \neq \emptyset$ **then** $\Pi(i, r) = \Pi(j, r-1) + \sum_{j < q \leq i} \pi(q, i)(B_q/B)$
**13** &emsp;&emsp; **else** $\Pi(i, r) = \infty$
**14** &emsp; **if** $\Pi(k, r) < \infty$ **then return** $T$
**15** **return** $T$

$\mathcal{S}_{\mathcal{F}} = \langle S_1, S_2, \ldots, S_k \rangle$. The following program characterizes the optimization problem that solves the Shannon mitigation problem with a stochastic policy.

Maximize E, subject to:

1. $0 \leq \mu(f_i)(f_j) \leq 1$ for $1 \leq i \leq j \leq k$,
2. $\sum_{i \leq j \leq k} \mu(f_i)(f_j) = 1$ for all $1 \leq i \leq k$,
3. $\sum_{i=1}^{k} \sum_{j=i}^{k} |S_i| \cdot \mu(f_i)(f_j) \cdot \pi(f_i, f_j) \leq \Delta$,
4. $C_j = \sum_{i=1}^{j} |S_i| \cdot \mu(f_i)(f_j)$ for $1 \leq j \leq k$.

Here, the objective function E is one of the following functions:

$$\begin{aligned} \text{1. Guessing Entropy } \mathcal{E}\_{GE} &= \sum\_{j=1}^{k} C\_j^2\\ \text{2. Min-Guess Entropy } \mathcal{E}\_{mGE} &= \min\_{1 \le j \le k} \{ C\_j \mid C\_j > 0 \},\\ \text{3. Shannon Entropy } \mathcal{E}\_{SE} &= -\sum\_{j=1}^{k} C\_j \cdot \log\_2(C\_j) \end{aligned}$$

The linear constraints are as follows: conditions (1) and (2) express that $\mu$ provides probability distributions, condition (3) captures the performance constraint, and condition (4) is the entropy-specific constraint. The objective function of the optimization problem is defined by the entropy criterion $\mathbb{E}$; for simplicity, we omit the constant terms from the objective function definitions. For the guessing entropy, the problem is an instance of linearly constrained quadratic optimization [33]. The problem with Shannon entropy is a non-linear optimization problem [12]. Finally, the optimization problem with min-guess entropy is an instance of mixed integer programming [32]. We evaluate the scalability of these solvers empirically in Sect. 6 and leave the exact complexity as an open problem. We show that the min-guess entropy objective can be solved efficiently with branch-and-bound algorithms [36]. Figure 4(b, c) shows two instantiations of mitigation policies that are possible with stochastic mitigation.

#### **6 Implementation Details**

**A. Environmental Setup.** All timing measurements are conducted on an Intel NUC5i5RYH. We switch off JIT compilation, run each experiment multiple times, and use the mean running time. This reduces the effects of environmental factors such as garbage collection. All other analyses are conducted on an Intel i5 2.7 GHz machine.

**B. Implementation of Side Channel Discovery.** We use the technique presented in [45] for side channel discovery. The technique applies functional data analysis [38] to create a B-spline basis and fit functions to the vector of timing observations for each secret value. Then, it applies functional data clustering [21] to obtain $K$ classes of observations. We use the number of secret values in a cluster as the class-size metric and the $L_1$ distance between clusters as the penalty function.

**C. Implementation of Mitigation Policy Algorithms.** For the stochastic optimization, we encode the Shannon entropy and guessing entropy objectives with linear constraints in Scipy [22]. Since the objective functions are non-linear (for the Shannon entropy) and quadratic (for the guessing entropy), Scipy uses sequential least squares programming (SLSQP) [34] to maximize the objectives. For the stochastic optimization with the min-guess entropy, we encode the problem in Gurobi [19] as a mixed-integer programming (MIP) problem [32], which Gurobi solves efficiently with branch-and-bound algorithms [1]. We use Java to implement the dynamic programming.

**D. Implementation of Enforcement.** The enforcement of a mitigation policy is implemented in two steps. *First*, we take the initial timing functions and characterize them with program-internal properties such as basic-block calls, using the decision tree learning approach presented in [45]. The decision tree model characterizes each functional observation by properties of program internals. *Second*, given the mitigation policy, we enforce it with a monitoring system implemented on top of the Javassist [15] library. The monitoring system matches the properties enabled during an execution against the tree model (detecting the current cluster) and then adds extra delays, based on the mitigation policy, to the current execution time. Note that the dynamic monitoring can itself introduce a few microseconds of delay. For programs with timing differences on the order of microseconds, we instead transform the source code using the decision tree model; the transformation requires manual effort to modify and compile the new program, but it adds negligible delay.
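The runtime half of this enforcement can be sketched as follows (a simplification with assumed names: `detect_class` stands in for the decision-tree match, and the policy maps a detected class to the target class's timing function):

```python
import time

def enforce(policy, detect_class, handle, y):
    """Serve request y, then pad the observable time up to the target
    class chosen by the mitigation policy (sketch; names assumed)."""
    start = time.perf_counter()
    result = handle(y)
    f = detect_class(y, result)          # current cluster, via the tree model
    target = policy[f]                   # post-mitigation timing function g
    elapsed = time.perf_counter() - start
    time.sleep(max(0.0, target(y) - elapsed))   # delay = max(0, g - f)
    return result
```

The `max(0.0, ...)` mirrors the policy semantics of Sect. 4: executions already slower than the target class receive no extra delay.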

**E. Micro-benchmark Results.** Our goal is to compare the different mitigation methods in terms of their security and performance. We also examine the computation time of our tool Schmit in calculating the mitigation policies. See the appendix for the relationships between performance bounds and entropy measures.

*Applications*: The Mod Exp applications [30] are instances of square-and-multiply modular exponentiation ($R = y^k \bmod n$) used for secret-key operations in RSA [39]. The Branch and Loop series consists of 6 applications, each of which has conditions over secret values and runs a linear loop over the public values. The running time of each application depends on the slope of the linear loop, which is determined by the secret input.
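The Mod Exp channel comes from the classic square-and-multiply loop, sketched below (our illustrative version, not the benchmark code): the extra multiply on each set bit of the exponent makes the operation count, and hence the running time, depend on the secret.

```python
def mod_exp(y, k, n):
    """R = y^k mod n by square-and-multiply; also returns the operation
    count, a time proxy that leaks the number of set bits of k."""
    R, ops = 1, 0
    for bit in bin(k)[2:]:              # scan exponent bits, MSB first
        R, ops = (R * R) % n, ops + 1   # square on every bit
        if bit == '1':
            R, ops = (R * y) % n, ops + 1   # multiply only on set bits
    return R, ops
```

For example, the exponents `0b1111` and `0b1000` have the same bit-length but different multiply counts; that difference is exactly the leak the mitigation must mask.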

*Computation time comparisons*: Fig. 5 shows the computation times for the Branch and Loop applications (ordered on the x-axis by the discovered number of observational classes). For the min-guess entropy, both the stochastic and the dynamic programming approaches are efficient and fast, as shown in Fig. 5(a). For the Shannon and guessing entropies,



the dynamic programming is scalable, while the stochastic mitigation becomes computationally expensive beyond 60 classes of observations, as shown in Fig. 5(b, c).

*Mitigation Algorithm Comparisons*: Table 1 shows micro-benchmark results that compare the four mitigation algorithms on the two program series. First, the double scheme mitigation technique [10] does not provide guarantees on the performance overhead, which increases by more than 75 times for mod exp 6. The double scheme method reduces the number of classes of observations, but we observe that it has difficulty improving the min-guess entropy. Second, the bucketing algorithm [26] can guarantee the performance overhead, but it is not an effective method for improving the security of functional observations; see the examples mod exp 6 and Branch and Loop 6. Third, among the algorithms, Schmit guarantees the performance to stay below the given bound while yielding the highest entropy values. In most cases, the stochastic optimization technique achieves the highest min-guess entropy value. Here, we show the results for the min-guess entropy measure, but we have strong evidence that Schmit also achieves higher Shannon and guessing entropies. For example, in B L 5, the initial Shannon entropy of 2.72 improves to 6.62, 4.1, 7.56, and 7.28 with the double scheme, bucketing, stochastic, and deterministic algorithms, respectively.

**Fig. 5.** Computation time for synthesizing mitigation policies over the Branch and Loop applications. Computing the min-guess entropy policy (a) takes only a few seconds. Computing the Shannon entropy (b) and guessing entropy (c) policies is expensive with stochastic optimization. We set the time-out to 10 hours.

#### **7 Case Study**

**Research Question.** Does Schmit scale well and improve the security of applications (entropy measures) within the given performance bounds?

**Methodology.** We use the deterministic and stochastic algorithms for mitigating the leaks. We show our results for the min-guess entropy, but other entropy measures can be applied as well. Since the task is to mitigate existing leakages, we assume that the secret and public inputs are given.

**Objects of Study.** We consider four real-world applications:

In the inset table, we show the basic characteristics of these benchmarks.


*GabFeed* is a chat server with 573 methods [4]. There is a side channel in the authentication part of the application, where the application takes users' public keys and its own private key and generates a common key [14]. The vulnerability leaks the number of set bits in the secret key. The initial functional observations are shown in Fig. 6a. There are 34 clusters and the min-guess entropy is 1. We aim to maximize the min-guess entropy under a performance overhead of 50%.

*Jetty.* We mitigate the side channels in the util.security package of the Eclipse Jetty web server. The package has a Credential class which had a timing side channel. This vulnerability was analyzed in [14] and initially fixed in [6]. Then, the developers noticed that the implementation in [6] could still leak information and fixed the issue with a new implementation in [5]. However, this new implementation still leaks information [45]. We apply Schmit to mitigate this timing side channel. The initial functional observations are shown in Fig. 6d. There are 20 classes of observations and the initial min-guess entropy is 4.5. We aim to maximize the min-guess entropy under a performance overhead of 50%.

*Java Verbal Expressions* is a library with 61 methods for constructing regular expressions [2]. The library has a timing side channel similar to the password comparison vulnerability [3] when it processes secret inputs: starting from the initial character of a candidate expression, if a character matches the regular expression, the library takes slightly more time to respond to the request than otherwise. This vulnerability can leak the entire regular expression. We consider regular expressions of maximum size 9. There are 9 classes of observations and the initial min-guess entropy is 50.5. We aim to maximize the min-guess entropy under a performance overhead of 50%.

*Password Checker.* We consider the password matching example from the loginBad program [9]. The password stored in the server is secret, and the user's guess is a public input. We consider 20 secrets (of length at most 6) and 2,620 public inputs. There are 6 different clusters, and the initial min-guess entropy is 1.

**Findings for GabFeed.** With the stochastic algorithm, Schmit calculates a mitigation policy that results in 4 clusters. This policy improves the min-guess entropy from 1 to 138.5 and adds an overhead of 42.8%. With the deterministic algorithm, Schmit returns 3 clusters; the performance overhead is 49.7% and the min-guess entropy improves from 1 to 106. The user chooses the deterministic policy and enforces the mitigation. We apply CART decision tree learning and characterize the classes of observations with GabFeed method calls as shown in

**Fig. 6.** Initial functional observations, decision tree, and the mitigated observations from left to right for Gabfeed, Jetty, and Verbal Expressions from top to bottom.

Fig. 6b. The monitoring system uses the decision tree model and automatically detects the current class of observation. Then, it adds extra delays based on the mitigation policy to enforce it. The results of the mitigation are shown in Fig. 6c. *Answer to our research question.* *Scalability*: It takes about 1 second to calculate both the stochastic and the deterministic policies. *Security*: The stochastic and deterministic variants improve the min-guess entropy more than 100 times under the given performance overhead of 50%.

**Findings for Jetty.** The stochastic and deterministic algorithms find the same policy, which results in 1 cluster with 39.6% performance overhead; the min-guess entropy improves from 4.5 to 400.5. For the enforcement, Schmit first uses the initial clustering and specifies its characteristics with program internals, resulting in the decision tree model shown in Fig. 6e. Since the response time is on the order of microseconds, we transform the source code using the decision tree model by adding extra counter variables. The results of the mitigation are shown in Fig. 6f. *Scalability*: It takes less than 1 second to calculate the policies for both algorithms. *Security*: Both variants improve the min-guess entropy 89 times under the given performance overhead.

**Findings for Java Verbal Expressions.** For the stochastic algorithm, the policy results in 2 clusters, and the min-guess entropy improves from 50.5 to 500.5 with a performance overhead of 36%. For the dynamic programming algorithm, the policy also results in 2 clusters; it adds 28% performance overhead and improves the min-guess entropy from 50.5 to 450.5. The user chooses the deterministic policy for the mitigation. For the mitigation, we transform the source code using the decision tree model and add extra delays based on the mitigation policy.

**Findings for Password Matching.** Both the deterministic and the stochastic algorithms find a policy with 2 clusters, where the min-guess entropy improves from 1 to 5.5 with a performance overhead of 19.6%. For the mitigation, we transform the source code using the decision tree model and add extra delays based on the mitigation policy where necessary.

#### **8 Related Work**

Quantitative theory of information has been widely used to measure how much information is leaked through side-channel observations [11,20,25,41]. Mitigation techniques increase the remaining entropy of secret sets leaked through side channels while considering performance [10,23,26,40,48,49].

Köpf and Dürmuth [26] use a bucketing algorithm to partition a program's observations into intervals. For the unknown-message threat model, they propose a dynamic programming algorithm to find the optimal number of possible observations under a performance penalty. The works [10,48] introduce different black-box schemes to mitigate leaks. In particular, Askarov et al. [10] show that quantizing-time techniques, which permit events to be released only at scheduled constant slots, have worst-case leakage if a slot is not filled with an event. Instead, they introduce the double scheme method, which has a schedule of predictions like the quantizing approach, but if the event source fails to deliver an event at the predicted time, a new schedule is generated in which the interval between predictions is doubled. We compare our mitigation technique with both algorithms throughout this paper.
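An idealized sketch of the double scheme's core idea, as we read it from [10] (simplified; function names ours): events are quantized to prediction slots, and a missed prediction doubles the interval used for the following slots.

```python
def double_scheme(event_times, interval):
    """Release times for events completing at event_times (ascending):
    each event waits for the next predicted slot; if an event misses its
    slot, the prediction interval doubles before rescheduling."""
    releases, predicted = [], interval
    for t in event_times:
        while t > predicted:     # prediction missed: penalize by doubling
            interval *= 2
            predicted += interval
        releases.append(predicted)
        predicted += interval
    return releases
```

Fast events are quantized to the current slots, so their exact timings stay hidden; a slow event pays with coarser (doubled) future slots rather than unbounded leakage.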

Elimination of timing side channels is a common technique to guarantee the confidentiality of software [7,17,27,30,31,46]. The work [46] aims to eliminate side channels using static analysis enhanced with various techniques to keep the performance overhead low, without guaranteeing the amount of overhead. In contrast, we use dynamic analysis and allow a small amount of information to leak, but we guarantee an upper bound on the performance overhead.

Machine learning techniques have been used to explain timing differences between traces [42–44]. Tizpaz-Niari et al. [44] consider performance issues in software: they cluster execution times of programs and then explain what program properties distinguish the different functional clusters. We adapt their techniques to our security problem.

**Acknowledgements.** The authors would like to thank Mayur Naik for shepherding our paper and providing useful suggestions. This research was supported by DARPA under agreement FA8750-15-2-0096.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Property Directed Self Composition**

Ron Shemer<sup>1(B)</sup>, Arie Gurfinkel<sup>2</sup>, Sharon Shoham<sup>1</sup>, and Yakir Vizel<sup>3</sup>

> <sup>1</sup> Tel Aviv University, Tel Aviv, Israel ronsheme@mail.tau.ac.il <sup>2</sup> University of Waterloo, Waterloo, Canada <sup>3</sup> The Technion, Haifa, Israel

**Abstract.** We address the problem of verifying k*-safety properties*: properties that refer to k interacting executions of a program. A prominent way to verify k-safety properties is by *self composition*. In this approach, the problem of checking k-safety over the original program is reduced to checking an "ordinary" safety property over a program that executes k copies of the original program in some order. The way in which the copies are composed determines how complicated it is to verify the composed program. We view this composition as provided by a *semantic self composition function* that maps each state of the composed program to the copies that make a move. Since the "quality" of a self composition function is measured by the ability to verify the safety of the composed program, we formulate the problem of inferring a self composition function together with the inductive invariant needed to verify safety of the composed program, where both are restricted to a given language. We develop a *property-directed* inference algorithm that, given a set of predicates, infers composition-invariant pairs expressed by Boolean combinations of the given predicates, or determines that no such pair exists. We implemented our algorithm and demonstrate that it is able to find self compositions that are beyond reach of existing tools.

#### **1 Introduction**

Many relational properties, such as noninterference [12], determinism [21], service level agreements [9], and more, can be reduced to the problem of k-safety, namely, reasoning about k different traces of a program simultaneously. A common approach to verifying k-safety properties is by means of *self composition*, where the program is composed with k copies of itself [4,32]. A state of the composed program consists of the states of each copy, and a trace naturally corresponds to k traces of the original program. Therefore, k-safety properties of the original program become ordinary safety properties of the composition, hence reducing k-safety verification to ordinary safety verification. This enables reasoning about k-safety properties using any of the existing techniques for safety verification such as Hoare logic [20] or model checking [7].

While self composition is sound and complete for k-safety, its applicability is questionable for two main reasons: (i) considering several copies of the program greatly increases the state space; and (ii) the way in which the different copies are composed when reducing the problem to safety verification affects the complexity of the resulting self-composed program, and as such affects the complexity of verifying it. Improving the applicability of self composition has been the topic of many

works [2,14,18,26,30,33]. However, most efforts are focused on compositions that are pre-defined, or only depend on syntactic similarities.

In this paper, we take a different approach; we build upon the observation that by choosing the "right" composition, the verification can be greatly simplified by leveraging "simple" correlations between the executions. To that end, we propose an algorithm, called PDSC, for inferring a *property directed* self composition. Our approach uses a *dynamic* composition, where the composition of the different copies can change during verification, directed at simplifying the verification of the composed program.

Compositions considered in previous work differ in the order in which the copies of the program execute: either synchronously, asynchronously, or in some mix of the two [3,14,34]. To allow general compositions, we define a *composition function* that maps every state of the composed program to the set of copies that are scheduled in the next step. This determines the order of execution for the different copies, and thus induces the self composed program. Unlike most previous works where the composition is pre-defined based on syntactic rules only, our composition is *semantic* as it is defined over the state of the composed program.
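A minimal sketch of this semantics (ours, with assumed names): the composed program's scheduler consults the composition function on the current composed state to decide which copies step next, and the constant function recovers lock-step composition.

```python
def run_composed(step, states, compose, done):
    """Run k copies under a semantic composition function: compose(states)
    returns the indices of the copies scheduled in the next move (it must
    always schedule some unfinished copy for the run to make progress)."""
    states = list(states)
    while not all(done(s) for s in states):
        for i in compose(states):
            if not done(states[i]):
                states[i] = step(states[i])
    return states

lock_step = lambda states: range(len(states))   # synchronous composition
```

A state-dependent composition, e.g. one that steps only the copy that is behind, is the kind of scheduling that can make a simple correlation such as i<sup>1</sup> = i<sup>2</sup> inductive.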

To capture the difficulty of verifying the composed program, we consider verification by means of inferring an inductive invariant, parameterized by a language for expressing the inductive invariant. Intuitively, the more expressive the language needs to be, the more difficult the verification task is. We then define the problem of inferring a composition function *together* with an inductive invariant for verifying the safety of the composed program, where both are restricted to a given language. Note that for a fixed language L, an inductive invariant may exist for some composition function but not for another<sup>1</sup>. Thus, the restriction to L defines a target for the inference algorithm, which is now directed at finding a composition that admits an inductive invariant in L.

*Example 1.* To demonstrate our approach, consider the program in Fig. 1. The program inserts a new value into an array. We assume that the array A and its length len are "low"-security variables, while the inserted value h is "high"-security. The first loop finds the location in which h will be inserted. Note that the number of iterations depends on the value of h. Due to that, the second loop executes to ensure that the output i (which corresponds to the number of iterations) does not leak sensitive data. As an example, we emphasize that without the second loop, i could leak the location of h in A. To express the property that i does not leak sensitive data, we use the 2-safety property that in any two executions, if the inputs A and len are the same, so is the output i.
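Fig. 1 is not reproduced here; the following is our reconstruction of its pattern (variable names assumed): the first loop's iteration count depends on the secret h, and the second loop pads the counter up to len so that the returned value does not.

```python
def insert(A, length, h):
    """Insert secret h into the sorted prefix A[0:length]; the returned i
    always equals length, regardless of where h lands."""
    i = 0
    while i < length and A[i] < h:   # iterations here depend on secret h
        i += 1
    A.insert(i, h)                   # low: A and length; high: h
    while i < length:                # padding loop: restores a fixed total
        i += 1                       # iteration count of `length`
    return i
```

Dropping the second loop would make the return value equal to the insertion position, leaking where h sits relative to the low array, exactly the leak described above.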

To verify the 2-safety property, consider two copies of the program. Let the language L for verifying the self composition be defined by the predicates depicted in Fig. 1. The most natural self composition to consider is a lock-step composition, where the copies execute synchronously. However, for such a composition the composed program may reach a state where, for example, i<sup>1</sup> = i<sup>2</sup> + 1. This occurs when the first copy exits the first loop while the second copy is still executing it. Since the language cannot express this correlation between the two copies, no inductive invariant suffices to verify that i<sup>1</sup> = i<sup>2</sup> when the program terminates.

<sup>1</sup> See the extended version [29] for an example that requires a non-linear inductive invariant with a composition that is based on the control structure but has a linear invariant with another.

**Fig. 1.** Constant-time insert to an array.

In contrast, when verifying the 2-safety property, PDSC directs its search towards a composition function for which an inductive invariant in L does exist. As such, it infers the composition function depicted in Fig. 1, as well as an inductive invariant in L. The invariant for this composition implies that i<sup>1</sup> = i<sup>2</sup> at every state.

As demonstrated by the example, PDSC focuses on logical languages based on predicate abstraction [17], where inductive invariants can be inferred by model checking. In order to infer a composition function that admits an inductive invariant in L, PDSC starts from a default composition function and modifies its definition based on the reasoning performed by the model checker during verification. As the composition function is part of the verified model (recall that it is defined over the program state), different compositions are part of the state space explored by the model checker. As a result, a key ingredient of PDSC is identifying "bad" compositions that prevent it from finding an inductive invariant in L. It is important to note that a naive algorithm that tries all possible composition functions has a time complexity of O(2<sup>2<sup>|P|</sup></sup>), where P is the set of predicates considered. However, integrating the search for a composition function into the model checking algorithm allows us to reduce the time complexity of the algorithm to 2<sup>O(|P|)</sup>, where we show that the problem is in fact PSPACE-hard.<sup>2</sup>

We implemented PDSC using SEAHORN [19], Z3 [25] and SPACER [22], and evaluated it on examples that demonstrate the need for nontrivial semantic compositions. Our results clearly show that PDSC can solve complex examples by inferring the required composition, while other tools cannot verify these examples. We emphasize that for these particular examples, lock-step composition is not sufficient. We also evaluated PDSC on the examples from [26,30] that are proven with the trivial lock-step composition. On these examples, PDSC is comparable to state-of-the-art tools.

**Related Work.** This paper addresses the problem of verifying k-safety properties (also called hyperproperties [8]) by means of self composition. Other approaches tackle the problem without self composition, and often focus on more specific properties, most notably the 2-safety noninterference property (e.g., [1,33]). Below we focus on works that use self composition.

<sup>2</sup> Proofs of the claims made in this paper can be found in the extended version [29].

Previous work such as [2–4,14,15,32] considered self composition (also called product programs) where the composition function is constant and fixed a priori, using syntax-based hints. While useful in general, such self compositions may result in programs that are too complex to verify. This is in contrast to our approach, where the composition function evolves during verification and adapts to the capabilities of the model checker.

The work most closely related to ours is [30], which introduces Cartesian Hoare Logic (CHL) for the verification of k-safety properties and designs a verification framework for this logic. This work is further improved in [26]. These works search for a proof in CHL, and in doing so, implicitly modify the composition. Our work infers the composition explicitly and can use off-the-shelf model checking tools. More importantly, when loops are involved, both [30] and [26] use lock-step composition and align loops syntactically. Our algorithm, in contrast, does not rely on syntactic similarities, and can handle loops that cannot be aligned trivially.

There have been several results on harnessing Constrained Horn Clause (CHC) solvers for the verification of relational properties [11,24]. Given several copies of a CHC system, a product CHC system that synchronizes the different copies is created by a syntactic analysis of the rules in the CHC system. These works restrict the synchronization points to CHC predicates (i.e., program locations), and consider only one synchronization (obtained via transformations of the system of CHCs). Our algorithm, on the other hand, iteratively searches for a good synchronization (composition), and considers synchronizations that depend on the program state.

*Equivalence Checking and Regression Verification.* Equivalence checking is another closely related research field, where a composition of several programs is considered. As an example, equivalence checking is applied to verify the correctness of compiler optimizations [10,18,28,34]. In [28] the composition is determined by a brute-force search for possible synchronization points. While this brute-force search resembles our approach for finding the correct composition, it is not guided by the verification process. The works in [10,18] identify possible synchronization points syntactically, and try to match them during the construction of a simulation relation between programs.

Regression verification also requires the ability to show equivalence between different versions of a program [15,16,31]. The problem of synchronizing unbalanced loops appears in [31] in the form of unbalanced recursive function calls. To allow synchronization in such cases, the user can specify different unrolling parameters for the different copies. In contrast, our approach relies only on user supplied predicates that are needed to establish correctness, while synchronization is handled automatically.

#### **2 Preliminaries**

In this paper we reason about programs by means of the transition systems defining their semantics. A transition system is a tuple T = (S, R, F), where S is a set of states, R ⊆ S × S is a transition relation that specifies the steps in an execution of the program, and F ⊆ S is a set of *terminal states*, such that every terminal state s ∈ F has an outgoing transition to itself and no additional transitions (terminal states allow us to reason about pre/post specifications of programs). An *execution* or *trace* π = s<sub>0</sub>, s<sub>1</sub>,... is a (finite or infinite) sequence of states such that for every i ≥ 0, (s<sub>i</sub>, s<sub>i+1</sub>) ∈ R. The execution is *terminating* if there exists 0 ≤ i ≤ |π| such that s<sub>i</sub> ∈ F. In this case, the suffix of the execution is of the form s<sub>i</sub>, s<sub>i</sub>,... and we say that π ends at s<sub>i</sub>.

As usual, we represent transition systems using logical formulas over a set of variables, corresponding to the program variables. We denote the set of variables by V. The set of terminal states is represented by a formula over V, and the transition relation is represented by a formula over V ∪ V′, where V represents the pre-state of a transition and V′ = {v′ | v ∈ V} represents its post-state. In the sequel, we use sets of states and their symbolic representation via formulas interchangeably.

*Safety and Inductive Invariants.* We consider safety properties defined via pre/post conditions.<sup>3</sup> A *safety property* is a pair (*pre*, *post*), where *pre* and *post* are formulas over V, representing subsets of S, denoting the pre- and post-condition, respectively. T *satisfies* (*pre*, *post*), denoted T |= (*pre*, *post*), if every terminating execution π of T that starts in a state s<sub>0</sub> such that s<sub>0</sub> |= *pre* ends in a state s such that s |= *post*. In other words, for every state s that is reachable in T from a state in *pre* we have that s |= F → *post*.

A prominent way to verify safety properties is by finding an inductive invariant. An *inductive invariant* for a transition system T and a safety property (*pre*, *post*) is a formula *Inv* such that (1) *pre* ⇒ *Inv* (initiation), (2) *Inv* ∧ R ⇒ *Inv*′ (consecution), and (3) *Inv* ⇒ (F → *post*) (safety), where ϕ ⇒ ψ denotes the validity of ϕ → ψ, and ϕ′ denotes ϕ(V′), i.e., the formula obtained by substituting every v ∈ V with the corresponding v′ ∈ V′. If such an inductive invariant exists, then T |= (*pre*, *post*).
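For intuition, conditions (1)–(3) can be checked by explicit enumeration when the system is finite-state. The toy counter system below is our own illustration, not taken from the paper:

```python
# Toy finite transition system (our illustration): a counter over 0..5
# that increments until 5, where it stays (5 is the terminal state).
S = set(range(6))
R = {(s, min(s + 1, 5)) for s in S}   # transition relation
F = {5}                               # terminal states
pre = {0}                             # precondition: start at 0
post = {5}                            # postcondition on terminal states

Inv = set(S)   # candidate inductive invariant (here, simply "true")

# (1) initiation: pre => Inv
assert pre <= Inv
# (2) consecution: Inv /\ R => Inv'
assert all(t in Inv for (s, t) in R if s in Inv)
# (3) safety: Inv => (F -> post)
assert all(s in post for s in Inv if s in F)
```

Symbolically the same three checks become validity queries discharged by an SMT solver; the set-based version above is just the explicit-state analogue.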

*k-safety.* A k*-safety property* refers to k interacting executions of T. Similarly to an ordinary property, it is defined by (*pre*, *post*), except that *pre* and *post* are defined over V<sup>1</sup> ∪ ... ∪ V<sup>k</sup>, where V<sup>i</sup> = {v<sup>i</sup> | v ∈ V} denotes the ith copy of the program variables. As such, *pre* and *post* represent sets of k-tuples of program states (k*-states* for short): for a k-tuple (s<sub>1</sub>,...,s<sub>k</sub>) of states and a formula ϕ over V<sup>1</sup> ∪ ... ∪ V<sup>k</sup>, we say that (s<sub>1</sub>,...,s<sub>k</sub>) |= ϕ if ϕ is satisfied when, for each i, the assignment of V<sup>i</sup> is determined by s<sub>i</sub>. We say that T *satisfies* (*pre*, *post*), denoted T |=<sup>k</sup> (*pre*, *post*), if for every k terminating executions π<sub>1</sub>,...,π<sub>k</sub> of T that start in states s<sub>1</sub>,...,s<sub>k</sub>, respectively, such that (s<sub>1</sub>,...,s<sub>k</sub>) |= *pre*, they end in states t<sub>1</sub>,...,t<sub>k</sub>, respectively, such that (t<sub>1</sub>,...,t<sub>k</sub>) |= *post*.

For example, the *noninterference* property may be specified by the following 2-safety property: *pre* = ⋀<sub>v∈LowIn</sub> v<sup>1</sup> = v<sup>2</sup> and *post* = ⋀<sub>v∈LowOut</sub> v<sup>1</sup> = v<sup>2</sup>, where LowIn and LowOut denote the subsets of the program inputs, resp. outputs, that are considered "low security", while the rest are classified as "high security". This property asserts that any two terminating executions that start in states that agree on the "low security" inputs end in states that agree on the "low security" outputs, i.e., the outcome does not depend on any "high security" input and, hence, does not leak secure information.
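On a finite input domain, this 2-safety property can be checked by enumerating pairs of runs that agree on the low inputs. The two toy programs below are our own illustrations:

```python
from itertools import product

# Two toy programs (our illustrations) with a low input lo, a high
# input hi, and a single low output.
def leaky(lo, hi):
    return lo + (1 if hi > 0 else 0)   # output depends on hi: leaks

def safe_prog(lo, hi):
    return lo * 2                      # output independent of hi

def noninterferent(prog, lows, highs):
    # 2-safety: any two runs agreeing on the low input must agree
    # on the low output, for every pair of high inputs.
    return all(prog(lo, h1) == prog(lo, h2)
               for lo, h1, h2 in product(lows, highs, highs))

assert not noninterferent(leaky, range(3), range(-2, 3))
assert noninterferent(safe_prog, range(3), range(-2, 3))
```

The universal quantification over pairs of runs is exactly what the self composition of the next paragraph encodes as a single ordinary safety property.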

Checking k-safety properties reduces to checking ordinary safety properties by creating a *self composed program* that consists of k copies of the transition system, each

<sup>3</sup> Our results can be extended to arbitrary safety (and k-safety) properties by introducing "observable" states to which the property may refer.

with its own copy of the variables, that run in parallel in some way. Thus, the self composed program is defined over variables V<sup>k</sup> = V<sup>1</sup> ∪ ... ∪ V<sup>k</sup>, where V<sup>i</sup> = {v<sup>i</sup> | v ∈ V} denotes the variables associated with the ith copy. For example, a common composition is a *lock-step* composition in which the copies execute simultaneously. The resulting composed transition system T<sup>k</sup> = (S<sup>k</sup>, R<sup>k</sup>, F<sup>k</sup>) is defined such that S<sup>k</sup> = S × ... × S, F<sup>k</sup> = ⋀<sub>i=1</sub><sup>k</sup> F(V<sup>i</sup>) and R<sup>k</sup> = ⋀<sub>i=1</sub><sup>k</sup> R(V<sup>i</sup>, V<sup>i</sup>′). Note that R<sup>k</sup> is defined over V<sup>k</sup> ∪ V<sup>k</sup>′ (as usual). Then, the k-safety property (*pre*, *post*) is satisfied by T if and only if the ordinary safety property (*pre*, *post*) is satisfied by T<sup>k</sup>. More general notions of *self composition* are investigated in Sect. 3.
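For a finite-state system and k = 2, the lock-step product T<sup>k</sup> can be built explicitly. The set-based construction below is our simplification of the symbolic definition:

```python
from itertools import product

# Lock-step self composition for k = 2: both copies step simultaneously.
def lockstep_product(S, R, F):
    S2 = set(product(S, S))                        # S^2 = S x S
    F2 = {p for p in S2 if p[0] in F and p[1] in F}
    R2 = {((s1, s2), (t1, t2))                     # R^2: both copies take an R-step
          for (s1, t1) in R for (s2, t2) in R}
    return S2, R2, F2

# Toy system (our illustration): counter 0..3, terminal state 3 (self-loop).
S = set(range(4))
R = {(s, min(s + 1, 3)) for s in S}
F = {3}

S2, R2, F2 = lockstep_product(S, R, F)
assert ((0, 1), (1, 2)) in R2        # the copies advance together
assert ((3, 3), (3, 3)) in R2        # composed terminal state self-loops
assert (3, 3) in F2 and (3, 0) not in F2
```

A 2-safety property of T then becomes an ordinary reachability question over the pairs in S2.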

#### **3 Inferring Self Compositions for Restricted Languages of Inductive Invariants**

Any self-composition is sufficient for reducing k-safety to safety, e.g., lock-step, sequential, synchronous, asynchronous, etc. However, the choice of the self-composition determines the difficulty of the resulting safety problem. Different self composed programs would require different inductive invariants, some of which cannot be expressed in a given logical language.

In this section, we formulate the problem of inferring a self composition function such that the obtained self composed program may be verified with a given language of inductive invariants. We are, therefore, interested in inferring both the self composition function and the inductive invariant for verifying the resulting self composed program. We start by formulating the kind of self compositions that we consider.

In the sequel, we fix a transition system T = (S, R, F) with a set of variables V.

#### **3.1 Semantic Self Composition**

Roughly speaking, a k self composition of T consists of k copies of T that execute together in some order, where steps may interleave or be performed simultaneously. The order is determined by a self composition function, which may also be viewed as a scheduler that is responsible for scheduling a subset of the copies in each step. We consider *semantic* compositions in which the order may depend on the *states* of the different copies, as well as the correlations between them (as opposed to *syntactic* compositions that only depend on the control locations of the copies, but may not depend on the values of other variables):

**Definition 1 (Semantic Self Composition Function).** *A* semantic k self composition function *(*k*-composition function for short) is a function* f : S<sup>k</sup> → P({1..k})*, mapping each* k*-state to a* nonempty *set of copies that are to participate in the next step of the self composed program.*<sup>4</sup>

<sup>4</sup> We consider *memoryless* composition functions. Compositions that depend on the history of the (joint) execution are supported via ghost state added to the program to track the history.

We represent a k-composition function f by a set of logical conditions, with a condition C<sub>M</sub> for every nonempty subset M ⊆ {1..k} of the copies. For each such M ⊆ {1..k}, the condition C<sub>M</sub> is defined over V<sup>k</sup> = V<sup>1</sup> ∪ ... ∪ V<sup>k</sup>, and hence it represents a set of k-states, with the meaning that all the k-states that satisfy C<sub>M</sub> are mapped to M by f:

$$f(s\_1, \ldots, s\_k) = M \text{ if and only if } (s\_1, \ldots, s\_k) \models C\_M.$$

To ensure that the function is well defined, we require that (⋁<sub>M</sub> C<sub>M</sub>) ≡ *true*, which ensures that every k-state satisfies at least one of the conditions. We also require that for every M<sub>1</sub> ≠ M<sub>2</sub>, C<sub>M<sub>1</sub></sub> ∧ C<sub>M<sub>2</sub></sub> ≡ *false*, hence every k-state satisfies at most one condition. Together these requirements ensure that the conditions induce a partition of the set of all k-states. In the sequel, we identify a k-composition function f with its symbolic representation via conditions {C<sub>M</sub>}<sub>M</sub> and use them interchangeably.

**Definition 2 (Composed Program).** *Given a* k*-composition function* f*, represented via conditions* C<sub>M</sub> *for every nonempty set* M ⊆ {1..k}*, we define the* k self composition *of* T *to be the transition system* T<sup>f</sup> = (S<sup>k</sup>, R<sup>f</sup>, F<sup>k</sup>) *over variables* V<sup>k</sup> = V<sup>1</sup> ∪ ... ∪ V<sup>k</sup>*, defined as follows:* F<sup>k</sup> = ⋀<sub>i=1</sub><sup>k</sup> F<sup>i</sup>*, where* F<sup>i</sup> = F(V<sup>i</sup>)*, and*

$$R^f = \bigvee\_{\emptyset \neq M \subseteq \{1\ldots k\}} (C\_M \wedge \varphi\_M) \quad \text{where} \quad \varphi\_M = \bigwedge\_{j \in M} R(\mathcal{V}^j, \mathcal{V}^{j'}) \wedge \bigwedge\_{j \notin M} \mathcal{V}^j = \mathcal{V}^{j'}$$

Thus, in T<sup>f</sup>, the set of states consists of k-states (S<sup>k</sup> = S × ... × S), the terminal states are k-states in which all the individual states are terminal, and the transition relation includes a transition from (s<sub>1</sub>,...,s<sub>k</sub>) to (s′<sub>1</sub>,...,s′<sub>k</sub>) if and only if f(s<sub>1</sub>,...,s<sub>k</sub>) = M and (∀i ∈ M. (s<sub>i</sub>, s′<sub>i</sub>) ∈ R) ∧ (∀i ∉ M. s<sub>i</sub> = s′<sub>i</sub>). That is, every transition of T<sup>f</sup> corresponds to a simultaneous transition of a subset M of the k copies of T, where the subset is determined by the self composition function f. If f(s<sub>1</sub>,...,s<sub>k</sub>) = M, then for every i ∈ M we say that i is *scheduled* in (s<sub>1</sub>,...,s<sub>k</sub>).

*Example 2.* A k self composition that runs the k copies of T sequentially, one after the other, corresponds to a k-composition function f defined by f(s<sub>1</sub>,...,s<sub>k</sub>) = {i}, where i ∈ {1..k} is the minimal index of a non-terminal state in {s<sub>1</sub>,...,s<sub>k</sub>}. If all states in {s<sub>1</sub>,...,s<sub>k</sub>} are terminal, then i = k (or any other index). This is encoded as follows: for every 1 ≤ i < k, C<sub>{i}</sub> = ¬F<sup>i</sup> ∧ ⋀<sub>j<i</sub> F<sup>j</sup>, C<sub>{k}</sub> = ⋀<sub>j<k</sub> F<sup>j</sup>, and C<sub>M</sub> = *false* for every other M ⊆ {1..k}.

*Example 3.* The lock-step composition that runs the k copies of T synchronously corresponds to a k-composition function f defined by f(s<sub>1</sub>,...,s<sub>k</sub>) = {1,...,k}, encoded by C<sub>{1,...,k}</sub> = *true* and C<sub>M</sub> = *false* for every other M ⊆ {1..k}.
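The encodings of Examples 2 and 3 can be sketched for k = 2 by giving each condition C_M as a Boolean function of the 2-state and checking the partition requirement (every 2-state satisfies exactly one condition). The toy terminal test is our assumption:

```python
from itertools import product

terminal = lambda s: s == 3   # toy terminal test (our assumption)

# Example 2 for k = 2 (sequential): run copy 1 until it terminates,
# then run copy 2.  C_{1} = not F^1, C_{2} = F^1, C_{1,2} = false.
sequential = {
    frozenset({1}):    lambda s1, s2: not terminal(s1),
    frozenset({2}):    lambda s1, s2: terminal(s1),
    frozenset({1, 2}): lambda s1, s2: False,
}

# Example 3 (lock-step): C_{1,2} = true, all other conditions false.
lockstep = {
    frozenset({1}):    lambda s1, s2: False,
    frozenset({2}):    lambda s1, s2: False,
    frozenset({1, 2}): lambda s1, s2: True,
}

def well_defined(conds, states):
    # Partition requirement: every 2-state satisfies exactly one C_M.
    return all(sum(c(s1, s2) for c in conds.values()) == 1
               for s1, s2 in product(states, states))

assert well_defined(sequential, range(4))
assert well_defined(lockstep, range(4))
```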

In order to ensure soundness of a reduction of k-safety to safety via self composition, one has to require that the self composition function does not "starve" any copy of the transition system that is about to terminate if it continues to execute. We refer to this requirement as *fairness*.

**Definition 3 (Fairness).** *A* k*-self composition function* f *is* fair *if for every* k *terminating executions* π<sub>1</sub>,...,π<sub>k</sub> *of* T *there exists an execution* π *of* T<sup>f</sup> *such that for every copy* i ∈ {1..k}*, the projection of* π *to copy* i *is* π<sub>i</sub>*.*

Note that by the definition of the terminal states of T<sup>f</sup>, π as above is guaranteed to be terminating. We say that the ith copy *terminates* in π if π contains a k-state (s<sub>1</sub>,...,s<sub>k</sub>) such that s<sub>i</sub> ∈ F. Fairness may be enforced in a straightforward way by requiring that whenever f(s<sub>1</sub>,...,s<sub>k</sub>) = M, the set M includes no index i for which s<sub>i</sub> ∈ F, unless all copies have terminated. Since we assume that terminal states may only transition to themselves, a weaker requirement that suffices to ensure fairness is that M includes at least one index i for which s<sub>i</sub> ∉ F, unless there is no such index.
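The weaker requirement can be checked explicitly for k = 2: whenever a condition C_M holds in a 2-state with a non-terminal copy, M must schedule at least one non-terminal copy. Below, a sequential scheduler passes the check while a scheduler that always picks copy 1 fails it (both schedulers and the toy terminal test are our illustrations):

```python
from itertools import product

terminal = lambda s: s == 3   # toy terminal test (our assumption)
states = range(4)

# Sequential scheduler: fair; "always schedule copy 1": starves copy 2.
seq = {frozenset({1}): lambda s1, s2: not terminal(s1),
       frozenset({2}): lambda s1, s2: terminal(s1)}
starve2 = {frozenset({1}): lambda s1, s2: True}

def fair(conds):
    # Weak fairness: whenever C_M holds and some copy is non-terminal,
    # M must contain at least one non-terminal copy.
    for s1, s2 in product(states, states):
        live = [i for i, s in ((1, s1), (2, s2)) if not terminal(s)]
        for M, c in conds.items():
            if c(s1, s2) and live and not (set(M) & set(live)):
                return False
    return True

assert fair(seq)
assert not fair(starve2)
```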

The following claim is now straightforward:

**Lemma 1.** *Let* T *be a transition system,* (*pre*, *post*) *a* k*-safety property, and* f *a fair* k*-composition function for* T *and* (*pre*, *post*)*. Then*

$$T \models^k (pre, post) \text{ if } \; T^f \models (pre, post).$$

*Proof (sketch).* Every terminating execution of T*<sup>f</sup>* corresponds to k terminating executions of T. Fairness of f ensures that the converse also holds.

To demonstrate the necessity of the fairness requirement, consider a (non-fair) self composition function f that maps every state to {1}. Then, regardless of what the actual transition system T does, the resulting self composition T*<sup>f</sup>* satisfies every pre-post specification vacuously, as it never reaches a terminal state.

*Remark 1.* While we require the conditions {C<sub>M</sub>}<sub>M</sub> defining a self composition function f to induce a partition of S<sup>k</sup> in order to ensure that f is well defined as a (total) function, the requirement may be relaxed in two ways. First, we may allow C<sub>M<sub>1</sub></sub> and C<sub>M<sub>2</sub></sub> to overlap. This adds more transitions and may make the task of verifying the composed program more difficult, but it maintains the soundness of the reduction. Second, it suffices that the conditions cover the set of *reachable states* of the composed program rather than the entire state space. These relaxations do not damage soundness. Technically, this means that f represented by the conditions is a relation rather than a function. We still refer to it as a function and write f(s<sub>1</sub>,...,s<sub>k</sub>) = M to indicate that (s<sub>1</sub>,...,s<sub>k</sub>) |= C<sub>M</sub>, not excluding the possibility that (s<sub>1</sub>,...,s<sub>k</sub>) |= C<sub>M′</sub> for M′ ≠ M as well. We note that as long as the language used to describe compositions is closed under Boolean operations, we can always extract from the conditions {C<sub>M</sub>}<sub>M</sub> a function f′. This is done as follows. First, to prevent the overlap between conditions, determine an arbitrary total order < on the sets M ⊆ {1..k} and set C′<sub>M</sub> := C<sub>M</sub> ∧ ⋀<sub>N<M</sub> ¬C<sub>N</sub>. Second, to ensure that the conditions cover the entire state space, set C′<sub>{1..k}</sub> := C′<sub>{1..k}</sub> ∨ ¬(⋁<sub>M</sub> C<sub>M</sub>). It is easy to verify that f′ defined by {C′<sub>M</sub>}<sub>M</sub> is a total self composition function, and that if f is fair, then so is f′.

#### **3.2 The Problem of Inferring Self Composition with Inductive Invariant**

Lemma 1 states the soundness of the reduction of k-safety to ordinary safety. Together with the ability to verify safety by means of an inductive invariant, this leads to a verification procedure. However, while soundness of the reduction holds for *any* self composition, an inductive invariant in a given language may exist for the composed program resulting from some compositions but not from others. We therefore consider the self composition function and the inductive invariant together, as a pair, leading to the following definition.

**Definition 4.** *Let* T *be a transition system and* (*pre*, *post*) *a* k*-safety property. For a formula Inv over* V<sup>k</sup> *and a self composition function* f *represented by conditions* {C<sub>M</sub>}<sub>M</sub>*, we say that* (f, *Inv*) *is a* composition-invariant *pair for* T *and* (*pre*, *post*) *if the following conditions hold:*

*(1)* *pre* ⇒ *Inv* *(initiation), (2)* *Inv* ∧ R<sup>f</sup> ⇒ *Inv*′ *(consecution), (3)* *Inv* ⇒ ((⋀<sub>i=1</sub><sup>k</sup> F<sup>i</sup>) → *post*) *(safety), (4)* *Inv* ⇒ ⋁<sub>M</sub> C<sub>M</sub> *(coverage), and (5) for every* ∅ ≠ M ⊆ {1..k}*,* *Inv* ∧ C<sub>M</sub> ∧ (⋁<sub>j=1</sub><sup>k</sup> ¬F<sup>j</sup>) ⇒ ⋁<sub>j∈M</sub> ¬F<sup>j</sup> *(fairness on reachable states).*
As commented in Remark 1, we relax the requirement that (⋁<sub>M</sub> C<sub>M</sub>) ≡ *true* to *Inv* ⇒ ⋁<sub>M</sub> C<sub>M</sub>, thus ensuring that the conditions cover all the reachable states. Since the reachable states of T<sup>f</sup> are determined by {C<sub>M</sub>}<sub>M</sub> (which define f), this reveals the interplay between the self composition function and the inductive invariant. Furthermore, we do not require that C<sub>M<sub>1</sub></sub> ∧ C<sub>M<sub>2</sub></sub> ≡ *false* for M<sub>1</sub> ≠ M<sub>2</sub>, hence a k-state may satisfy multiple conditions. As explained earlier, these relaxations do not damage soundness. Furthermore, if we construct from f a self composition function f′ as described in Remark 1, *Inv* is an inductive invariant for T<sup>f′</sup> as well.

**Lemma 2.** *If there exists a composition-invariant pair* (f, *Inv*) *for* T *and* (*pre*, *post*)*, then* T |=<sup>k</sup> (*pre*, *post*)*.*

If we do not restrict the language in which f and *Inv* are specified, then the converse also holds. However, in the sequel we are interested in the ability to verify k-safety with a given language, e.g., one for which the conditions of Definition 4 belong to a decidable fragment of logic and hence can be discharged automatically.

**Definition 5 (Inference in** L**).** *Let* L *be a logical language. The problem of inferring a composition-invariant pair in* L *is defined as follows. The input is a transition system* T *and a* k*-safety property* (*pre*, *post*)*. The output is a composition-invariant pair* (f, *Inv*) *for* T *and* (*pre*, *post*) *(as defined in Definition 4), where Inv* ∈ L *and* f *is represented by conditions* {C<sub>M</sub>}<sub>M</sub> *such that* C<sub>M</sub> ∈ L *for every* ∅ ≠ M ⊆ {1..k}*. If no such pair exists, the output is "no solution".*

When no solution exists, it does not necessarily mean that T ⊭<sup>k</sup> (*pre*, *post*). Instead, it may be that the language L is simply not expressive enough. Unfortunately, for expressive languages (e.g., quantified formulas or even quantifier-free linear integer arithmetic), the problem of inferring an inductive invariant alone is already undecidable, making the problem of inferring a composition-invariant pair undecidable as well:

**Lemma 3.** *Let* L *be closed under Boolean operations and under substitution of a variable with a value, and include equalities of the form* v = a*, where* v *is a variable and* a *is a value (of the same sort). If the problem of inferring an inductive invariant in* L *is undecidable, then so is the problem of inferring a composition-invariant pair in* L*.*

For example, linear integer arithmetic satisfies the conditions of the lemma. This motivates us to restrict the languages of inductive invariants. Specifically, we consider languages defined by a finite set of predicates. We consider *relational* predicates, defined over V<sup>k</sup> = V<sup>1</sup> ∪ ... ∪ V<sup>k</sup>. For a finite set of predicates P, we define L<sup>P</sup> to be the set of all formulas obtained by Boolean combinations of the predicates in P.

**Definition 6 (Inference using predicate abstraction).** *The problem of inferring a predicate-based composition-invariant pair is defined as follows. The input is a transition system* T*, a* k*-safety property* (*pre*, *post*)*, and a finite set of predicates* P*. The output is the solution to the problem of inferring a composition-invariant pair for* T *and* (*pre*, *post*) *in* L<sup>P</sup> *.*

*Remark 2.* It is possible to decouple the language used for expressing the self composition function from the language used to express the inductive invariant. Clearly, different sets of predicates (and hence languages) can be assigned to the self composition function and to the inductive invariant. However, since inductiveness is defined with respect to the transitions of the composed system, which are in turn defined by the self composition function, if the language defining f is not included in the language defining *Inv*, the conditions C*<sup>M</sup>* themselves would be over-approximated when checking the requirements of Definition 4 and therefore would incur a precision loss. For this reason, we use the same language for both.

Since the problem of invariant inference in L<sup>P</sup> is PSPACE-hard [23], a reduction from the problem of inferring inductive invariants to the problem of inferring composition-invariant pairs (similar to the one used in the proof of Lemma 3) shows that composition-invariant inference in L<sup>P</sup> is also PSPACE-hard:

**Theorem 1.** *Inferring a predicate-based composition-invariant pair is PSPACE-hard.*

#### **4 Algorithm for Inferring Composition-Invariant Pairs**

In this section, we present Property Directed Self-Composition, PDSC for short—our algorithm for tackling the composition-invariant inference problem for languages of predicates (Definition 6). Namely, given a transition system T, a k-safety property (*pre*, *post*) and a finite set of predicates P, we address the problem of finding a pair (f,*Inv*), where f is a self composition function and *Inv* is an inductive invariant for the composed transition system <sup>T</sup>*<sup>f</sup>* obtained from <sup>f</sup>, and both of them are in <sup>L</sup><sup>P</sup> , i.e., defined by Boolean combinations of the predicates in P.

We rely on the property that a transition system (in our case T<sup>f</sup>) has an inductive invariant in L<sup>P</sup> if and only if its abstraction obtained using P is safe. This is because the set of reachable abstract states is the strongest set expressible in L<sup>P</sup> that satisfies initiation and consecution. Given T<sup>f</sup>, this allows us to use predicate abstraction to either obtain an inductive invariant in L<sup>P</sup> for T<sup>f</sup> (if the abstraction of T<sup>f</sup> is safe) or determine that no such inductive invariant exists (if an abstract counterexample trace is obtained). The latter indicates that a different self composition function needs to be considered. A naive realization of this idea gives rise to an iterative algorithm that starts from an

```
1  f ← lockstep,  E ← ∅,  Unreach ← false
2  while (true) do
3      (res, Inv, cex) ← Abs_Reach(P, T^f, pre, post, Unreach)
4      if res = safe then return (f, Inv(P))
5      (ŝ, M) ← Last_Step(cex)
6      E ← E ∪ {(ŝ, M)}
7      while (All_Excluded_Or_Starving(ŝ, E)) do
8          Unreach ← Unreach ∨ ŝ
9          if Unreach ∧ ϕ_pre(B) ≢ false then return "no solution in L_P"
10         cex ← Remove_Last_Step(cex)
11         (ŝ, M) ← Last_Step(cex)
12         E ← E ∪ {(ŝ, M)}
13     f ← Modify_SC(f, ŝ, E)
```
**Algorithm 1.** PDSC: Property-Directed Self-Composition.

arbitrary initial composition function and in each iteration computes a new composition function. In the worst case, such an algorithm enumerates all self composition functions definable in L<sup>P</sup>, i.e., has time complexity O(2<sup>2<sup>|P|</sup></sup>). Importantly, we observe that, when no inductive invariant exists for some composition function, we can use the abstract counterexample trace returned in this case to (i) generalize and eliminate multiple composition functions, and (ii) identify that some abstract states must be unreachable if there is to be a composition-invariant pair, i.e., we "block" states in the spirit of *property directed reachability* [5,13]. This leads to the algorithm depicted in Algorithm 1, whose worst case time complexity is 2<sup>O(|P|)</sup>. Next, we explain the algorithm in detail.
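The Abs_Reach call at line 3 of Algorithm 1 is, at its core, a reachability check over the abstract state space that either proves safety or returns an abstract counterexample trace. A minimal explicit-state sketch follows (our simplification; the actual implementation builds on a model checker such as SPACER, and `blocked` plays the role of the Unreach states):

```python
from collections import deque

# Minimal explicit-state sketch of the reachability core behind Abs_Reach:
# BFS over abstract states; returns "safe" with the reachable set, or
# "cex" with an abstract counterexample trace.
def abs_reach(Rhat, init, bad, blocked=frozenset()):
    parent = {s: None for s in init}    # BFS tree, for trace reconstruction
    queue = deque(init)
    while queue:
        s = queue.popleft()
        if s in bad:                    # abstract counterexample found
            trace = []
            while s is not None:
                trace.append(s)
                s = parent[s]
            return ("cex", trace[::-1])
        for (a, b) in Rhat:
            if a == s and b not in parent and b not in blocked:
                parent[b] = s
                queue.append(b)
    return ("safe", set(parent))

# Abstract chain 0 -> 1 -> 2 -> 3 with bad state 3: a cex exists,
# unless state 2 is blocked, in which case the system is safe.
Rhat = {(0, 1), (1, 2), (2, 3)}
assert abs_reach(Rhat, {0}, {3}) == ("cex", [0, 1, 2, 3])
assert abs_reach(Rhat, {0}, {3}, blocked={2}) == ("safe", {0, 1})
```

Blocking an abstract state here mirrors line 8 of Algorithm 1, where PDSC accumulates states into Unreach.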

**Finding an Inductive Invariant for a Given Composition Function Using Predicate Abstraction.** We use predicate abstraction [17,27] to check if a given candidate composition function has a corresponding inductive invariant. This is done as follows. The abstraction of T<sup>f</sup> using P, denoted A<sub>P</sub>(T<sup>f</sup>), is a transition system (Ŝ, R̂) defined over variables B, where B = {b<sub>p</sub> | p ∈ P} (we omit the terminal states). Ŝ = {0, 1}<sup>B</sup>, i.e., each abstract state corresponds to a valuation of the Boolean variables representing P. An abstract state ŝ ∈ Ŝ represents the following set of states of T<sup>f</sup>:

$$\gamma(\hat{s}) = \{ \bar{s} \in S^k \mid \forall p \in \mathcal{P}. \, \bar{s} \models p \Leftrightarrow \hat{s}(b\_p) = 1 \}$$

We extend γ to sets of states and to formulas representing sets of states in the usual way. The abstract transition relation is defined as usual:

$$\hat{R} = \{ (\hat{s}\_1, \hat{s}\_2) \mid \exists \bar{s}\_1 \in \gamma(\hat{s}\_1) \; \exists \bar{s}\_2 \in \gamma(\hat{s}\_2) . \, (\bar{s}\_1, \bar{s}\_2) \in R^f \}$$

Note that the set of abstract states in A<sub>P</sub>(T<sup>f</sup>) does *not* depend on f.
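The abstraction A<sub>P</sub>(T<sup>f</sup>) can be computed explicitly for a small system: abstract states are bit-valuations of the predicates, and R̂ contains an edge whenever some concrete R<sup>f</sup>-transition connects the corresponding concretizations. The toy system and predicates below are our own:

```python
# Toy composed system, k = 2: concrete 2-states are pairs of counters 0..3.
S = range(4)
R = {(s, min(s + 1, 3)) for s in S}                      # counter steps to 3, then stays
Rf = {((a, b), (c, d)) for (a, c) in R for (b, d) in R}  # lock-step R^f

# Predicates P, given as Boolean functions of a 2-state s = (s1, s2).
P = [lambda s: s[0] == s[1],   # p0: the two copies agree
     lambda s: s[0] == 3]      # p1: copy 1 has terminated

def alpha(s):
    # The abstract state of s: a valuation of the Boolean variables b_p.
    return tuple(int(p(s)) for p in P)

# Abstract transition relation: (alpha(s), alpha(t)) for every concrete
# transition (s, t) -- exactly the existential image defining R-hat.
Rhat = {(alpha(s), alpha(t)) for (s, t) in Rf}

assert (alpha((0, 0)), alpha((1, 1))) in Rhat   # agreeing copies stay related
assert ((1, 0), (1, 1)) in Rhat                 # e.g., from (2, 2) to (3, 3)
```

Since `alpha` ignores f only through `Rf`, the abstract state space itself is the same for every composition function, matching the note above.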

*Notation.* We sometimes refer to an abstract state $\hat{s} \in \hat{S}$ as the formula $\bigwedge_{\hat{s}(b_p)=1} b_p \wedge \bigwedge_{\hat{s}(b_p)=0} \neg b_p$. For a formula $\psi \in \mathcal{L}_\mathcal{P}$, we denote by $\psi(\mathcal{B})$ the result of substituting each $p \in \mathcal{P}$ in $\psi$ by the corresponding Boolean variable $b_p$. For the opposite direction, given a formula $\psi$ over $\mathcal{B}$, we denote by $\psi(\mathcal{P})$ the formula in $\mathcal{L}_\mathcal{P}$ resulting from substituting each $b_p \in \mathcal{B}$ in $\psi$ by $p$. Therefore, $\psi(\mathcal{P})$ is a symbolic representation of $\gamma(\psi)$.

Every set defined by a formula $\psi \in \mathcal{L}_\mathcal{P}$ is precisely represented by $\psi(\mathcal{B})$, in the sense that $\gamma(\psi(\mathcal{B}))$ is equal to the set of states defined by $\psi$; i.e., $\psi(\mathcal{B})$ is a precise abstraction of $\psi$. For simplicity, we assume that the termination conditions as well as the pre/post specification can be expressed precisely using the abstraction, in the following sense:
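To make the $\psi(\mathcal{B})$ and $\psi(\mathcal{P})$ substitutions concrete, here is a minimal Python sketch; the tuple-based formula encoding and the predicate strings are our own illustration, not the paper's implementation:

```python
# Toy encoding (ours, for illustration): a formula is either an atom
# (string) or a tuple (op, *subformulas). psi(B) replaces each predicate
# p by its Boolean variable b_p; psi(P) is the inverse substitution.

def substitute(formula, mapping):
    """Recursively replace every atom found in `mapping` by its image."""
    if isinstance(formula, tuple):
        op, *args = formula
        return (op, *(substitute(a, mapping) for a in args))
    return mapping.get(formula, formula)

# Hypothetical predicates over two program copies.
preds = ["x1 = x2", "y1 <= y2"]
to_bool = {p: f"b{i}" for i, p in enumerate(preds)}   # p -> b_p
to_pred = {b: p for p, b in to_bool.items()}          # b_p -> p

psi = ("and", "x1 = x2", ("not", "y1 <= y2"))
psi_B = substitute(psi, to_bool)
assert psi_B == ("and", "b0", ("not", "b1"))
assert substitute(psi_B, to_pred) == psi   # the round trip is the identity
```

Since the substitutions are inverses of each other on atoms, $\psi(\mathcal{B})(\mathcal{P})$ recovers $\psi$, matching the "precise abstraction" remark above.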

**Definition 7.** $\mathcal{P}$ *is* adequate *for* $T$ *and* $(pre, post)$ *if there exist* $\varphi_{pre}, \varphi_{post}, \varphi_{F^i} \in \mathcal{L}_\mathcal{P}$ *such that* $\varphi_{pre} \equiv pre$*,* $\varphi_{post} \equiv post$*, and* $\varphi_{F^i} \equiv F^i$ *(for every copy* $i \in \{1..k\}$*).*

The following lemma provides the foundation for our algorithm:

**Lemma 4.** *Let* $T$ *be a transition system,* $(pre, post)$ *a* $k$*-safety property, and* $\mathcal{P}$ *a finite set of predicates adequate for* $T$ *and* $(pre, post)$*. For a self composition function* $f$ *defined via conditions* $\{C_M\}_M$ *in* $\mathcal{L}_\mathcal{P}$*, there exists an inductive invariant* Inv *in* $\mathcal{L}_\mathcal{P}$ *such that* $(f, Inv)$ *is a composition-invariant pair for* $T$ *and* $(pre, post)$ *if and only if the following three conditions hold:*

**S1** *All states of* $\mathcal{A}_\mathcal{P}(T^f)$ *reachable from* $\varphi_{pre}(\mathcal{B})$ *satisfy* $(\bigwedge_{i=1}^{k} \varphi_{F^i}(\mathcal{B})) \rightarrow \varphi_{post}(\mathcal{B})$*,*

**S2** *All states of* $\mathcal{A}_\mathcal{P}(T^f)$ *reachable from* $\varphi_{pre}(\mathcal{B})$ *satisfy* $\bigvee_M C_M(\mathcal{B})$*, and*

**S3** *For every* $\emptyset \neq M \subseteq \{1..k\}$*,* $C_M(\mathcal{B}) \wedge (\bigvee_{j=1}^{k} \neg\varphi_{F^j}(\mathcal{B})) \Longrightarrow \bigvee_{j \in M} \neg\varphi_{F^j}(\mathcal{B})$*.*

*Furthermore, if the conditions hold, then the symbolic representation of the set of abstract states of* $\mathcal{A}_\mathcal{P}(T^f)$ *reachable from* $\varphi_{pre}(\mathcal{B})$ *is a formula* Inv *over* $\mathcal{B}$ *such that* $(f, Inv(\mathcal{P}))$ *is a composition-invariant pair for* $T$ *and* $(pre, post)$*.*

Algorithm 1 starts from the lock-step self composition function (Line 1), which is fair<sup>5</sup>, and constructs each next candidate $f$ such that condition **S3** in Lemma 4 always holds (see the discussion of Modify SC). Thus, condition **S3** need not be checked explicitly.

Algorithm 1 checks whether conditions **S1** and **S2** hold for a given candidate composition function $f$ by calling Abs Reach (Line 3) – both checks are performed via a (non-)reachability check in $\mathcal{A}_\mathcal{P}(T^f)$, checking whether a state violating $(\bigwedge_{i=1}^{k} \varphi_{F^i}(\mathcal{B})) \rightarrow \varphi_{post}(\mathcal{B})$ or $\bigvee_M C_M(\mathcal{B})$ is reachable from $\varphi_{pre}(\mathcal{B})$. Algorithm 1 maintains the abstract states that are not in $\bigvee_M C_M(\mathcal{B})$ by the formula *Unreach* defined over $\mathcal{B}$, which is initialized to *false* (as the lock-step composition function is defined for every state) and is updated in each iteration of Algorithm 1 to include the abstract states violating $\bigvee_M C_M(\mathcal{B})$. If no abstract state violating **S1** or **S2** is reachable, i.e., the conditions hold, then Abs Reach returns the (potentially overapproximated) set of reachable abstract states, represented by a formula *Inv* over $\mathcal{B}$. In this case, by Lemma 4, $(f, Inv(\mathcal{P}))$ is a composition-invariant pair (Line 4). Otherwise, an abstract counterexample trace is obtained. (We could of course apply bounded model checking to check whether the counterexample is real; we omit this check since our focus is on the case where the system is safe.)
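The overall control flow just described can be rendered as the following heavily simplified Python sketch; all names are ours, Abs Reach and Modify SC are passed in as callables, and the blocking of unreachable states is compressed:

```python
# Sketch of the Algorithm 1 loop (ours, simplified): f maps abstract states
# to schedule values M; abs_reach returns ("inv", Inv) when S1 and S2 hold,
# or ("cex", trace) otherwise.

def algorithm1(f, abs_reach, modify_sc, nonstarving, is_pre):
    E, unreach = set(), set()
    while True:
        status, payload = abs_reach(f, unreach)
        if status == "inv":
            return "pair", f, payload          # composition-invariant pair
        s_m = payload[-2]                      # predecessor of the bad state
        E.add((s_m, f.get(s_m)))               # constraint from Lemma 5
        allowed = [M for M in nonstarving(s_m) if (s_m, M) not in E]
        if allowed:
            f = modify_sc(f, s_m, allowed[0])  # next candidate composition
        elif is_pre(s_m):
            return "no solution", None, None   # a pre-state must be blocked
        else:
            unreach.add(s_m)                   # block: s_m must be unreachable
            f = {s: M for s, M in f.items() if s != s_m}

# Tiny scripted instance: the lock-step value (1, 2) on state "s1" leads to
# a counterexample; any other non-starving value makes the check succeed.
def abs_reach(f, unreach):
    if f.get("s1") == (1, 2):
        return "cex", ["s0", "s1", "bad"]
    return "inv", "Inv_formula"

def modify_sc(f, s, M):
    g = dict(f); g[s] = M; return g

status, f, inv = algorithm1({"s0": (1, 2), "s1": (1, 2)}, abs_reach, modify_sc,
                            nonstarving=lambda s: [(1,), (2,), (1, 2)],
                            is_pre=lambda s: s == "s0")
assert status == "pair" and f["s1"] == (1,) and inv == "Inv_formula"
```

The scripted `abs_reach` stands in for the predicate-abstraction reachability check; the real algorithm derives `s_m` and the blocked states from genuine abstract traces.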

*Remark 3.* In practice, we do not construct A<sup>P</sup> (T*<sup>f</sup>* ) explicitly. Instead, we use the *implicit predicate abstraction* approach [6].

<sup>5</sup> Any fair self composition can be chosen as the initial one; we chose lock-step since it is a good starting point in many applications.

**Eliminating Self Composition Candidates Based on Abstract Counterexamples.** An abstract counterexample to condition **S1** or **S2** indicates that the candidate composition function $f$ has no corresponding *Inv*. A violation of **S1** can only be resolved by changing $f$ such that the abstract trace is no longer feasible. A violation of **S2** may, in principle, also be resolved by extending the definition of $f$ so that it is defined for all the abstract states in the counterexample trace.

However, to avoid having to explore both options, our algorithm maintains the following invariant for every candidate self composition function $f$ that it constructs:

*Claim.* Every abstract state that is *not* in $\bigvee_M C_M(\mathcal{B})$ is not reachable w.r.t. the abstract composed program of *any* composition function that is part of a composition-invariant pair for $T$ and $(pre, post)$.

This property clearly holds for the lock-step composition function, with which the algorithm starts, since for this composition $\bigvee_M C_M(\mathcal{B}) \equiv true$. As we explain in Corollary 2, it continues to hold throughout the algorithm.

As a result of this property, whenever a candidate composition function $f$ does not satisfy condition **S1** or **S2**, it is never the case that $\bigvee_M C_M(\mathcal{B})$ needs to be extended to allow the abstract states in *cex* to be reachable. Instead, the abstract counterexample obtained in violation of the conditions needs to be eliminated by modifying $f$.

Let $cex = \hat{s}_1, \ldots, \hat{s}_{m+1}$ be an abstract counterexample of $\mathcal{A}_\mathcal{P}(T^f)$ such that $\hat{s}_1 \models \varphi_{pre}(\mathcal{B})$ and $\hat{s}_{m+1} \models (\bigwedge_{i=1}^{k} \varphi_{F^i}(\mathcal{B})) \wedge \neg\varphi_{post}(\mathcal{B})$ (violating **S1**) or $\hat{s}_{m+1} \models Unreach$ (violating **S2**). Any self composition $f'$ that agrees with $f$ on the states in $\gamma(\hat{s}_i)$ for every $\hat{s}_i$ that appears in $cex$ has the same transitions in $R^{f'}$ and, hence, the same transitions in $\hat{R}$. It, therefore, exhibits the same abstract counterexample in $\mathcal{A}_\mathcal{P}(T^{f'})$. Hence, it violates **S1** or **S2** and is not part of any composition-invariant pair.

*Notation.* Recall that $f$ is defined via conditions $C_M \in \mathcal{L}_\mathcal{P}$. This ensures that for every abstract state $\hat{s}$, $f$ is defined in the same way for all the states in $\gamma(\hat{s})$. We denote the value of $f$ on the states in $\gamma(\hat{s})$ by $f(\hat{s})$ (in particular, $f(\hat{s})$ may be undefined). We get that $f(\hat{s}) = M$ if and only if $\hat{s} \models C_M(\mathcal{B})$.

Using this notation, to eliminate the abstract counterexample *cex*, one needs to eliminate at least one of the transitions in *cex* by changing the definition of $f(\hat{s}_i)$ for *some* $1 \leq i \leq m$. For a new candidate function $f'$ this may be encoded by the disjunctive constraint $\bigvee_{i=1}^{m} f'(\hat{s}_i) \neq f(\hat{s}_i)$. However, we observe that a stronger requirement may be derived from *cex* based on the following lemma:

**Lemma 5.** *Let* $f$ *be a self composition function and* $cex = \hat{s}_1, \ldots, \hat{s}_{m+1}$ *a counterexample trace in* $\mathcal{A}_\mathcal{P}(T^f)$ *such that* $\hat{s}_1 \models \varphi_{pre}(\mathcal{B})$ *but* $\hat{s}_{m+1} \models (\bigwedge_{i=1}^{k} \varphi_{F^i}(\mathcal{B})) \wedge \neg\varphi_{post}(\mathcal{B})$ *or* $\hat{s}_{m+1} \models Unreach$*. Then for any self composition function* $f'$ *such that* $f'(\hat{s}_m) = f(\hat{s}_m)$*, if* $\hat{s}_m$ *is reachable in* $\mathcal{A}_\mathcal{P}(T^{f'})$ *from* $\varphi_{pre}(\mathcal{B})$*, then a counterexample trace to* **S1** *or* **S2** *exists.*

**Corollary 1.** *If there exists a composition-invariant pair* $(f', Inv')$*, then there is also one where* $f'(\hat{s}_m) \neq f(\hat{s}_m)$*.*

Therefore, we require that in the next self composition candidates the abstract state $\hat{s}_m$ must not be mapped to its current value in $f$, i.e., $f'(\hat{s}_m) \neq M$, where $f(\hat{s}_m) = M$<sup>6</sup>.
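The gap between the disjunctive constraint and the stronger requirement from Corollary 1 can be sketched over a toy trace encoding (all names and data are our own illustration):

```python
# Toy sketch (ours): from a counterexample trace, the weak constraint says
# "f' must differ from f on some state of cex", while the strengthened one
# pins the change to hat{s}_m, the predecessor of the violating state.

def weak_constraint(cex, f):
    # One pair (state, old value) per hat{s}_i, 1 <= i <= m; f' must
    # differ on at least one of them (a disjunction).
    return [(s, f[s]) for s in cex[:-1]]

def strong_constraint(cex, f):
    s_m = cex[-2]
    return [(s_m, f[s_m])]                 # f'(s_m) != f(s_m), always

f = {"s1": (1,), "s2": (2,), "s3": (1, 2)}
cex = ["s1", "s2", "s3", "bad"]
assert strong_constraint(cex, f) == [("s3", (1, 2))]
assert set(strong_constraint(cex, f)) <= set(weak_constraint(cex, f))
```

The strong constraint is a single definite exclusion rather than a disjunction, which is what lets Algorithm 1 accumulate it directly in $E$.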

Algorithm 1 accumulates these constraints in the set $E$ (Line 6). Formally, the constraint $(\hat{s}, M) \in E$ asserts that $C'_M$ must imply $\neg(\bigwedge_{\hat{s}(b_p)=1} p \wedge \bigwedge_{\hat{s}(b_p)=0} \neg p)$, and hence $f'(\hat{s}) \neq M$.

**Identifying Abstract States that Must Be Unreachable.** A new candidate self composition is constructed such that it satisfies all the constraints in $E$ (thus ensuring that no abstract counterexample will re-appear). In the construction, we make sure to satisfy **S3** (fairness). Therefore, for every abstract state $\hat{s}$, we choose a value $f'(\hat{s})$ that satisfies the constraints in $E$ and is *non-starving*: a value $M$ is starving for $\hat{s}$ if $\hat{s} \models \bigvee_{j=1}^{k} \neg\varphi_{F^j}(\mathcal{B})$ but $\hat{s} \not\models \bigvee_{j \in M} \neg\varphi_{F^j}(\mathcal{B})$, i.e., some of the copies have not terminated in $\hat{s}$ but none of the non-terminated copies is scheduled. (Due to adequacy, a value $M$ is starving for $\hat{s}$ if and only if it is starving for every $s \in \gamma(\hat{s})$.)
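The starving check has a direct combinatorial reading; the following sketch (our own toy encoding of an abstract state as a map from copy index to a termination flag) enumerates the non-starving values for $k = 2$:

```python
from itertools import chain, combinations

# Sketch (ours): enumerate non-starving schedule values M for an abstract
# state. `terminated[j]` encodes whether phi_{F^j} holds in the state.

def nonempty_subsets(k):
    idx = range(1, k + 1)
    return chain.from_iterable(combinations(idx, r) for r in range(1, k + 1))

def is_starving(M, terminated):
    # M is starving iff some copy has not terminated, yet every scheduled
    # copy in M already has.
    some_running = any(not t for t in terminated.values())
    none_in_M_running = all(terminated[j] for j in M)
    return some_running and none_in_M_running

terminated = {1: True, 2: False}   # copy 1 done, copy 2 still running
values = [M for M in nonempty_subsets(2) if not is_starving(M, terminated)]
# Only schedules that include the still-running copy 2 survive.
assert values == [(2,), (1, 2)]
```

This is the filter Algorithm 1 applies before picking the next value $f'(\hat{s})$; only values surviving both this filter and the exclusions in $E$ are candidates.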

If for some abstract state $\hat{s}$, all the non-starving values have already been excluded (i.e., $(\hat{s}, M) \in E$ for every non-starving $M$), we conclude that there is *no* $f'$ such that $\hat{s}$ is reachable in $\mathcal{A}_\mathcal{P}(T^{f'})$ and $f'$ is part of a composition-invariant pair:

**Lemma 6.** *Let* $\hat{s} \in \hat{S}$ *be an abstract state such that for every* $\emptyset \neq M \subseteq \{1..k\}$ *either* $M$ *is starving for* $\hat{s}$ *or* $(\hat{s}, M) \in E$*. Then, for every* $f'$ *that satisfies* **S3***, if* $\mathcal{A}_\mathcal{P}(T^{f'})$ *satisfies* **S1** *and* **S2***, then* $\hat{s}$ *is unreachable in* $\mathcal{A}_\mathcal{P}(T^{f'})$*.*

**Corollary 2.** *If there exists a composition-invariant pair* $(f', Inv')$*, then* $\hat{s}$ *is unreachable in* $\mathcal{A}_\mathcal{P}(T^{f'})$*.*

This is because no matter how the self composition function $f'$ is defined, $\hat{s}$ is guaranteed to have an outgoing abstract counterexample trace in $\mathcal{A}_\mathcal{P}(T^{f'})$.

We, therefore, make $f'(\hat{s})$ undefined. As a result, condition **S2** of Lemma 4 requires that $\hat{s}$ be unreachable in $\mathcal{A}_\mathcal{P}(T^{f'})$. In Algorithm 1, this is enforced by adding $\hat{s}$ to *Unreach* (Line 8).

Every abstract state $\hat{s}$ that is added to *Unreach* strengthens the safety property with an additional constraint that must be obeyed by any composition-invariant pair, and obtaining such a pair is precisely the target of the algorithm. This makes our algorithm *property directed*.

If an abstract state that satisfies $\varphi_{pre}(\mathcal{B})$ is added to *Unreach*, then Algorithm 1 determines that no solution exists (Line 9). Otherwise, it generates a new constraint for $E$ based on the abstract state preceding $\hat{s}$ in the abstract counterexample (Line 12).

**Constructing the Next Candidate Self Composition Function.** Given the set of constraints in $E$ and the formula *Unreach*, Modify SC (Line 13) generates the next candidate composition function by (i) taking a constraint $(\hat{s}, M)$ such that $\hat{s} \not\models Unreach$ (typically the one that was added last), (ii) selecting a non-starving value $M_{new}$ for $\hat{s}$ (such

<sup>6</sup> If the conditions $\{C_M\}_M$ defining $f$ may overlap, we consider the condition $C_M$ by which the transition from $\hat{s}_m$ to $\hat{s}_{m+1}$ was defined.

a value must exist, otherwise $\hat{s}$ would have been added to *Unreach*), and (iii) updating the conditions defining $f$ as follows:

$$C'_M = C_M \wedge \neg\hat{s}(\mathcal{P}) \qquad\qquad C'_{M_{new}} = C_{M_{new}} \vee \hat{s}(\mathcal{P})$$

The conditions of the other values remain as before. This definition is facilitated by the fact that the same set of predicates is used both for defining $f$ and for defining the abstract states $\hat{s} \in \hat{S}$ (by which *Inv* is obtained). Note that in practice we do not explicitly make $f$ undefined on $\gamma(Unreach)$; these definitions are simply ignored. The definition ensures that $f$ is non-starving (satisfying condition **S3**) and that no two distinct conditions $C'_{M_1} \neq C'_{M_2}$ overlap. While the latter is not required, it also does not restrict the generality of the approach (since the language we consider is closed under Boolean operations).
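The condition update can be sketched over an explicit finite representation of the conditions, here as sets of abstract states; this toy encoding is ours, whereas the tool manipulates the conditions symbolically:

```python
# Sketch of the Modify_SC condition update (ours): move the abstract state
# s_hat from C_M to C_{M_new}. Conditions are sets of abstract states,
# which is feasible because abstract states range over the finite {0,1}^B.

def modify_sc(conditions, s_hat, M, M_new):
    conditions = {key: set(val) for key, val in conditions.items()}
    conditions[M].discard(s_hat)                      # C'_M = C_M /\ not s_hat(P)
    conditions.setdefault(M_new, set()).add(s_hat)    # C'_{M_new} = C_{M_new} \/ s_hat(P)
    return conditions

C = {(1, 2): {(0, 0), (0, 1)}, (2,): set()}
C2 = modify_sc(C, (0, 1), (1, 2), (2,))
assert C2[(1, 2)] == {(0, 0)} and C2[(2,)] == {(0, 1)}
# Afterwards, no abstract state belongs to two conditions.
```

Because the state is removed from $C_M$ exactly when it is added to $C_{M_{new}}$, disjointness of the conditions is preserved, matching the remark above.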

**Theorem 2.** *Let* $T$ *be a transition system,* $(pre, post)$ *a* $k$*-safety property and* $\mathcal{P}$ *a set of predicates over* $V^k$*. If Algorithm 1 returns "no solution" then there is no composition-invariant pair for* $T$ *and* $(pre, post)$ *in* $\mathcal{L}_\mathcal{P}$*. Otherwise,* $(f, Inv(\mathcal{P}))$ *returned by Algorithm 1 is a composition-invariant pair in* $\mathcal{L}_\mathcal{P}$*, and thus* $T \models_k (pre, post)$*.*

*Complexity.* Each iteration of Algorithm 1 adds at least one constraint to $E$, excluding a potential value for $f$ over some abstract state $\hat{s}$. An excluded value is never re-used. Hence, the number of iterations is at most the number of abstract states, $2^{|\mathcal{P}|}$, multiplied by the number of potential values for each abstract state, $n = 2^k$. Altogether, the number of iterations is at most $O(2^{|\mathcal{P}|} \cdot 2^k)$. Each iteration makes one call to Abs Reach, which checks reachability via predicate abstraction; hence, assuming that satisfiability checks in the original logic are at most exponential, its complexity is $2^{O(|\mathcal{P}|)}$. Therefore, the overall complexity of the algorithm is $2^{O(|\mathcal{P}|)+k}$. Typically, $k$ is a small constant, hence the complexity is dominated by $2^{O(|\mathcal{P}|)}$.
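As a small worked instance of the iteration bound (note that, exactly, there are $2^k - 1$ nonempty subsets $M$, which the bound rounds up to $2^k$):

```python
# Worked instance (ours) of the O(2^{|P|} * 2^k) bound on iterations.
P, k = 4, 2
abstract_states = 2 ** P            # valuations of the predicate Booleans
values_per_state = 2 ** k - 1       # nonempty subsets M of {1..k}
assert abstract_states == 16 and values_per_state == 3
assert abstract_states * values_per_state == 48     # <= 2^|P| * 2^k = 64
```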

#### **5 Evaluation and Conclusion**

*Implementation.* We implemented PDSC (Algorithm 1) in Python on top of Z3 [25]. Its input is a transition system encoded by Constrained Horn Clauses (CHC) in SMT2 format, a k-safety property and a set of predicates. The abstraction is implicitly encoded using the approach of [6], and is parameterized by a composition function that is modified in each iteration. For reachability checks (Abs Reach) we use SPACER [22], which supports LRA and arrays. For the set of predicates used by PDSC, we implemented an automatic procedure that mines these predicates from the CHC. Additional predicates may be added manually.

*Experiments.* To evaluate PDSC, we compare it to SYNONYM [26], the current state of the art in k-safety verification.

To show the effectiveness of PDSC, we consider examples that require a *nontrivial* composition (these examples are detailed in [29]). We emphasize that these examples originate from real-life scenarios. For example, Fig. 1 follows a pattern of constant-time execution. The results of these experiments are summarized in Table 1.


**Table 1.** Examples that require semantic compositions

**Fig. 2.** Runtime comparison (in sec.): PDSC (x-axis) and SYNONYM (y-axis).

PDSC is able to find the right composition function and prove all of the examples, while SYNONYM cannot verify any of them. We emphasize that for these examples, lock-step composition is not sufficient; instead, PDSC infers a composition that depends on the programs' state (variable values), rather than just program locations.

Next we consider Java programs from [26,30], which we manually converted to C, and then converted to CHC using SEAHORN [19]. For all but 3 examples, only 2 types of predicates, which we mined automatically, were sufficient for verification: (i) relational predicates derived from the pre- and post-conditions, and (ii) for simple loops that have an index variable (e.g., for iterating over an array), an equality predicate between the copies of the indices. These predicates were sufficient since we used a large-step encoding of the transition relation, hence the abstraction via predicates takes effect only at cut-points. For the remaining 3 examples, we manually added 2–4 predicates. With the exception of 1 example where a timeout of 10 seconds was reached, all examples were solved with a lock-step composition function. Yet, we include them to show that on examples with simple compositions PDSC performs similarly to SYNONYM. This can be seen in Fig. 2.

**Conclusion and Future Work.** This work formulates the problem of inferring a self composition function together with an inductive invariant for the composed program, thus capturing the interplay between the self composition and the difficulty of verifying the resulting composed program. To address this problem we present PDSC, an algorithm for inferring a semantic self composition, directed at verifying the composed program with a given language of predicates. We show that PDSC manages to find nontrivial self compositions that are beyond the reach of existing tools. In future work, we are interested in further improving PDSC by extending it with additional (possibly lazy) predicate-discovery capabilities. This has the potential both to improve performance and to verify properties over a wider range of programs. Additionally, we plan to explore further generalization techniques during the inference procedure.

**Acknowledgements.** This publication is part of a project that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No [759102-SVIS]). The research was partially supported by Len Blavatnik and the Blavatnik Family foundation, the Blavatnik Interdisciplinary Cyber Research Center, Tel Aviv University, the Israel Science Foundation (ISF) under grant No. 1810/18 and the United States-Israel Binational Science Foundation (BSF) grant No. 2016260.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Security-Aware Synthesis Using Delayed-Action Games**

Mahmoud Elfar(B) , Yu Wang , and Miroslav Pajic

Duke University, Durham, NC 27708, USA {mahmoud.elfar,yu.wang094,miroslav.pajic}@duke.edu

**Abstract.** Stochastic multiplayer games (SMGs) have gained attention in the field of strategy synthesis for multi-agent reactive systems. However, standard SMGs are limited to modeling systems where all agents have full knowledge of the state of the game. In this paper, we introduce the delayed-action games (DAGs) formalism, which simulates hidden-information games (HIGs) as SMGs by capturing hidden information through the delaying of a player's actions. The elimination of private variables enables the use of off-the-shelf SMG model checkers to implement HIGs. Furthermore, we demonstrate how a DAG can be decomposed into subgames that can be independently explored, utilizing parallel computation to reduce the model checking time, while alleviating the state-space explosion problem that SMGs are notorious for. In addition, we propose a DAG-based framework for strategy synthesis and analysis. Finally, we demonstrate the applicability of the DAG-based synthesis framework on a case study of a human-on-the-loop unmanned-aerial vehicle system under stealthy attacks, where the proposed framework is used to formally model, analyze, and synthesize security-aware strategies for the system.

#### **1 Introduction**

Stochastic multiplayer games (SMGs) are used to model reactive systems where nondeterministic decisions are made by multiple players [4,13,23]. SMGs extend probabilistic automata by assigning a player to each choice to be made in the game. This extension enables modeling of complex systems where the behavior of players is unknown at design time. The *strategy synthesis* problem aims to find a *winning strategy*, i.e., a strategy that guarantees that a set of objectives (or winning conditions) is satisfied [6,21]. Algorithms for synthesis include, for instance, value iteration and strategy iteration techniques, where multiple reward-based objectives are satisfied [2,9,17]. To tackle the state-space explosion problem, [29] presents an *assume-guarantee* synthesis framework that relies on synthesizing strategies on the component level first, before composing them into a global winning strategy. Mean-payoffs and ratio rewards are further investigated in [3]

This work was supported by the NSF CNS-1652544 grant, as well as the ONR N00014- 17-1-2012 and N00014-17-1-2504, and AFOSR FA9550-19-1-0169 awards.

to synthesize ε-optimal strategies. Formal tools that support strategy synthesis via SMGs include PRISM-games [7,19] and Uppaal Stratego [10].

SMGs are classified based on the number of players that can make choices at each state. In *concurrent* games, more than one player is allowed to concurrently make choices at a given state. Conversely, *turn-based* games assign one player at most to each state. Another classification considers the information available to different players across the game [27]. *Complete-information* games (also known as *perfect-information* games [5]) grant all players complete access to the information within the game. In *symmetric* games, some information is equally hidden from all players. On the contrary, *asymmetric* games allow some players to have access to more information than the others [27].

This work is motivated by security-aware systems in which stealthy adversarial actions are potentially hidden from the system, while the latter can probabilistically and intermittently gain full knowledge about the current state. While hidden-information games (HIGs) can be used to model such systems by using private variables to capture hidden information [5], standard model checkers can only synthesize strategies for (full-information) SMGs, thus demanding alternative representations. The equivalence between turn-based semi-perfect information games and concurrent perfect-information games was shown in [5]. Since a player's strategy relies mainly on full knowledge of the game state [9], using SMGs for synthesis produces strategies that may violate the synthesis specifications in cases where the required information is hidden from the player. *Partially observable* stochastic games (POSGs) allow agents to have different belief states by incorporating uncertainty about both the current state and adversarial plans [15]. Techniques such as active sensing for online replanning [14] and grid-based abstractions of belief spaces [24] have been proposed to mitigate the synthesis complexity arising from partial observability. The notion of *delaying actions* has been studied as a means for gaining information about a game to improve future strategies [18,30], but not as a means for hiding information.

To this end, we introduce delayed-action games (DAGs), a new class of games that simulate HIGs, where information is hidden from one player by delaying the actions of the others. The omission of private variables enables the use of off-the-shelf tools to implement and analyze DAG-based models. We show how DAGs (under some mild and practical assumptions) can be decomposed into subgames that can be independently explored, reducing the time required for synthesis by employing parallel computation. Moreover, we propose a DAG-based framework for strategy synthesis and analysis of security-aware systems. Finally, we demonstrate the framework's applicability through a case study of security-aware planning for an unmanned-aerial vehicle (UAV) system prone to stealthy cyber attacks, where we develop a DAG-based system model and further synthesize strategies with strong probabilistic security guarantees.

The paper is organized as follows. Section 2 presents SMGs, HIGs, and problem formulation. In Sect. 3, we introduce DAGs and show that they can simulate HIGs. Section 4 proposes a DAG-based synthesis framework, which we use for security-aware planning for UAVs in Sect. 5, before concluding the paper in Sect. 6.

#### **2 Stochastic Games**

In this section, we present turn-based stochastic games, which assume that all players have full information about the game state. We then introduce hidden-information games and their private-variable semantics.

**Notation.** We use $\mathbb{N}_0$ to denote the set of non-negative integers. $\mathcal{P}(A)$ denotes the powerset of $A$ (i.e., $2^A$). A variable $v$ has a set of valuations $Ev(v)$, where $\eta(v) \in Ev(v)$ denotes one of them. We use $\Sigma^*$ to denote the set of all finite words over alphabet $\Sigma$, including the empty word $\epsilon$. The mapping $\mathit{Eff}: \Sigma^* \times Ev(v) \rightarrow Ev(v)$ indicates the effect of a finite word on $\eta(v)$. Finally, for general indexing, we use $s_i$ or $s^{(i)}$, for $i \in \mathbb{N}_0$, while $\mathrm{PL}_\gamma$ denotes *Player* $\gamma$.

**Turn-Based Stochastic Games (SMGs).** SMGs can be used to model reactive systems that undergo both stochastic and nondeterministic transitions from one state to another. In a *turn-based* game,<sup>1</sup> actions can be taken at any state by at most one player. Formally, an SMG can be defined as follows [1,28,29].

**Definition 1 (Turn-Based Stochastic Game).** *A* turn-based game (SMG) *with players* $\Gamma = \{\mathrm{I}, \mathrm{II}, \circ\}$ *is a tuple* $\mathcal{G} = \langle S, (S_{\mathrm{I}}, S_{\mathrm{II}}, S_\circ), A, s_0, \delta \rangle$*, where*


For all $s \in S_{\mathrm{I}} \cup S_{\mathrm{II}}$ and $a \in A_{\mathrm{I}} \cup A_{\mathrm{II}}$, we write $s \xrightarrow{a} s'$ if $\delta(s, a, s') = 1$. Similarly, for all $s \in S_\circ$ we write $s \xrightarrow{p} s'$ if $s'$ is randomly sampled with probability $p = \delta(s, \tau, s')$.
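The two transition modes can be sketched with a toy turn-based SMG; the dictionary encoding and state names below are our own minimal illustration, unrelated to any tool's input format:

```python
# Minimal turn-based SMG sketch (ours): deterministic player moves at
# player-owned states and a probability distribution at stochastic states.

smg = {
    "s0": ("I",  {"a": "s1"}),             # PL_I chooses an action
    "s1": ("II", {"b": "s2"}),             # PL_II chooses an action
    "s2": ("o",  {"s0": 0.3, "s3": 0.7}),  # stochastic state: distribution
    "s3": ("II", {}),                      # terminal for this sketch
}

def step(state, choice):
    owner, moves = smg[state]
    if owner in ("I", "II"):
        return moves[choice]               # s --a--> s' with delta = 1
    raise ValueError("stochastic states are sampled, not chosen")

assert step("s0", "a") == "s1"
assert abs(sum(smg["s2"][1].values()) - 1.0) < 1e-9   # valid distribution
```

At most one player owns each state, which is exactly the turn-based restriction described above.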

**Hidden-Information Games.** SMGs assume that all players have full knowledge of the current state, and hence provide *perfect-information* models [5]. In many applications, however, this assumption may not hold. A prime example is security-aware models, where stealthy adversarial actions can be hidden from the system; e.g., the system may not even be aware that it is under attack. In contrast, *hidden-information* games (HIGs) refer to games where one player does not have complete access to (or knowledge of) the current state. The notion of hidden information can be formalized with the use of *private variables* (PVs) [5]. Specifically, a game state can be encoded using variables $v_T$ and $v_B$, representing the true information, which is known only to $\mathrm{PL}_{\mathrm{I}}$, and $\mathrm{PL}_{\mathrm{II}}$'s belief, respectively.

<sup>1</sup> The term turn-based indicates that at any state only one player can play an action. It does not necessarily imply that players take fair turns.

**Definition 2 (Hidden-Information Game).** *A* hidden-information stochastic game (HIG) *with players* $\Gamma = \{\mathrm{I}, \mathrm{II}, \circ\}$ *over a set of variables* $V = \{v_T, v_B\}$ *is a tuple* $\mathcal{G}_H = \langle S, (S_{\mathrm{I}}, S_{\mathrm{II}}, S_\circ), A, s_0, \beta, \delta \rangle$*, where*


In the above definition, $\delta$ only allows transitions from $s_{\mathrm{I}}$ to $s_{\mathrm{II}}$, from $s_{\mathrm{II}}$ to $s_{\mathrm{I}}$ or $s_\circ$ (with the transition from $s_{\mathrm{II}}$ to $s_\circ$ conditioned on the action $\theta$), and probabilistic transitions from $s_\circ$ to $s_{\mathrm{II}}$. A game state can be written as $s = (t, u, \Omega, \gamma)$, but to simplify notation we use $s_\gamma(t, u, \Omega)$ instead, where $t \in Ev(v_T)$ is the *true* value of the game, $u \in Ev(v_B)$ is $\mathrm{PL}_{\mathrm{II}}$'s current *belief*, $\Omega \in \mathcal{P}(Ev(v_T)) \setminus \{\emptyset\}$ is $\mathrm{PL}_{\mathrm{II}}$'s *belief space*, and $\gamma \in \Gamma$ is the current player's index. When the truth is hidden from $\mathrm{PL}_{\mathrm{II}}$, the belief space $\Omega$ is the *information set* [27], capturing $\mathrm{PL}_{\mathrm{II}}$'s knowledge about the possible true values.

*Example 1 (Belief vs. True Value).* Our motivating example is a system that consists of a UAV and a human operator. For localization, the UAV mainly relies on a GPS sensor that can be compromised to effectively steer the UAV away from its original path. While aggressive attacks can be detected, some may remain stealthy by introducing only bounded errors at each step [16,20,22,26]. For example, Fig. 1 shows a UAV ($\mathrm{PL}_{\mathrm{II}}$) occupying zone A and flying north (N). An adversary ($\mathrm{PL}_{\mathrm{I}}$) can launch a stealthy attack targeting its GPS, introducing a bounded error (NE, NW) to remain stealthy. The set of stealthy actions available to the attacker depends on the preceding UAV action, which is captured by the function $\beta$, where $\beta(\mathrm{N}) = \{\mathrm{NE}, \mathrm{N}, \mathrm{NW}\}$. Being unaware of the attack, the UAV believes that it is entering zone C, while the true new location is D due to the attack (NE). Initially, $\eta(v_T) = \eta(v_B) = z_A$ and $\Omega = \{z_A\}$, as the UAV is certain it is in zone $z_A$. In $s_2$, $\eta(v_B) = z_C$, yet $\eta(v_T) = z_D$. Although $v_T$ is hidden, $\mathrm{PL}_{\mathrm{II}}$ is aware that $\eta(v_T)$ is in $\Omega = \{z_B, z_C, z_D\}$.

**Fig. 1.** The UAV belief (solid square) vs. the true value (solid diamond) of its location.
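The growth of the information set $\Omega$ in this example can be sketched on a toy grid encoding; the coordinates, move table, and `beta` map below are our own illustration of the scenario, not the paper's model:

```python
# Sketch of the belief-space update from Example 1 (our toy grid encoding):
# after the UAV commands N, a stealthy attacker may apply any action in
# beta(N) = {NE, N, NW}, so the information set Omega grows accordingly.

MOVES = {"N": (0, 1), "NE": (1, 1), "NW": (-1, 1)}
BETA = {"N": ["NE", "N", "NW"]}

def apply(loc, action):
    dx, dy = MOVES[action]
    return (loc[0] + dx, loc[1] + dy)

def update_belief_space(omega, uav_action):
    # Every stealthy deviation of every possible true location is possible.
    return {apply(loc, a) for loc in omega for a in BETA[uav_action]}

omega0 = {(0, 0)}                        # UAV certain of its initial zone
omega1 = update_belief_space(omega0, "N")
assert omega1 == {(1, 1), (0, 1), (-1, 1)}   # analogue of {z_B, z_C, z_D}
```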

**HIG Semantics.** $\mathcal{G}_H$ semantics is described by the rules shown in Fig. 2, where H2 and H3 capture $\mathrm{PL}_{\mathrm{II}}$ and $\mathrm{PL}_{\mathrm{I}}$ moves, respectively. The rule H4 specifies that an attempt $\theta$ by $\mathrm{PL}_{\mathrm{II}}$ to reveal the true value succeeds with probability $p_i$, in which case $\mathrm{PL}_{\mathrm{II}}$'s belief is updated (i.e., $u = t$), and remains unchanged otherwise.

**Fig. 2.** Semantic rules for an HIG.

*Example 2 (HIG Semantics).* Continuing Example 1, let us assume that the set of actions $A_{\mathrm{I}} = A_{\mathrm{II}} = \{\mathrm{N}, \mathrm{S}, \mathrm{E}, \mathrm{W}, \mathrm{NE}, \mathrm{NW}, \mathrm{SE}, \mathrm{SW}\}$, and that $\theta = \mathrm{GT}$ is a geolocation task that attempts to reveal the true value of the game.<sup>2</sup> Now, consider the scenario illustrated in Fig. 3. At the initial state $s_0$, the UAV attempts to move north (N), progressing the game to the state $s_1$, where the adversary takes her turn by selecting an action from the set $\beta(\mathrm{N}) = \{\mathrm{NE}, \mathrm{N}, \mathrm{NW}\}$. The players take turns until the UAV performs a geolocation task GT, moving from the state $s_4$ to $s_5$. With probability $p = \delta(s_5, \tau, s_6)$, the UAV detects its true location and updates its belief accordingly (i.e., to $s_6$). Otherwise, the belief remains the same (i.e., equal to $s_4$).

**Fig. 3.** An example of the UAV motion in a 2D-grid map, modeled as an HIG. Solid squares represent the UAV belief, while solid diamonds represent the ground truth. The UAV action GT denotes performing a geolocation task.
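The probabilistic belief update of rule H4 can be sketched as follows; the probability value, zone names, and function name are our own illustration:

```python
import random

# Sketch of rule H4 (our encoding): a geolocation task reveals the true
# location with probability p; otherwise the belief stays unchanged.

def geolocation_task(belief, truth, p, rng):
    return truth if rng.random() < p else belief

rng = random.Random(0)
outcomes = {geolocation_task("zC", "zD", 0.8, rng) for _ in range(100)}
assert outcomes <= {"zC", "zD"}    # only the two rule outcomes can occur
```

With $p = 1$ the task always reveals the truth, and with $p = 0$ the belief never changes, matching the two branches of H4.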

**Problem Formulation.** Following the system described in Example 2, we now consider the composed HIG $\mathcal{G}_H = \mathcal{M}_{adv} \parallel \mathcal{M}_{uav} \parallel \mathcal{M}_{as}$ shown in Fig. 4; the HIG-based model incorporates standard models of a UAV ($\mathcal{M}_{uav}$), an adversary ($\mathcal{M}_{adv}$), and a geolocation-task advisory system ($\mathcal{M}_{as}$) (e.g., as introduced in [11,12]). Here, the probability of a successful detection $p(v_T, v_B)$ is a function of both the location the UAV believes to be its current location ($v_B$) as well

<sup>2</sup> A geolocation task is an attempt to localize the UAV by examining its camera feed.

as the ground truth location that the UAV actually occupies ($v_T$). Reasoning about the flight plan using such a model becomes problematic, since the ground truth $v_T$ is inherently unknown to the UAV (i.e., $\mathrm{PL}_{\mathrm{II}}$), and thus so is $p(v_T, v_B)$. Furthermore, such a representation, where some information is hidden, is not supported by off-the-shelf SMG model checkers. Consequently, for such HIGs, *our goal is to find an alternative representation that is suitable for strategy synthesis using off-the-shelf SMG model checkers.*

**Fig. 4.** An example of an HIG-based system model comprised of the UAV (M<sub>uav</sub>), the adversary (M<sub>adv</sub>), and the AS (M<sub>as</sub>). Framed information is hidden from the UAV-AS.

#### **3 Delayed-Action Games**

In this section, we provide an alternative representation of HIGs that eliminates the use of private variables: we introduce Delayed-Action Games (DAGs), which exploit the notion of *delayed actions*. Furthermore, we show that for any HIG, a DAG that simulates it can be constructed.

**Delayed Actions.** Informally, a DAG reconstructs an HIG such that actions of PL<sub>I</sub> (the player with access to perfect information) follow the actions of PL<sub>II</sub>, i.e., PL<sub>I</sub> actions are *delayed*. This rearrangement of the players' actions provides a means to hide information from PL<sub>II</sub> without the use of private variables, since in this case, at PL<sub>II</sub> states, PL<sub>I</sub> actions have not occurred yet. In this way, PL<sub>II</sub> can act as though she has complete information at the moment she makes her decision, as the future state has not yet happened and so cannot be known. In essence, the formalism can be seen as a partial ordering of the players' actions, exploiting the (partial) superposition property that a wide class of physical systems exhibit. To demonstrate this notion, let us consider DAG modeling of our running example.

*Example 3 (Delaying Actions).* Figure 5 depicts the (HIG-based) scenario from Fig. 3 in the corresponding DAG, where the UAV actions are performed first (in ŝ<sub>0</sub>, ŝ<sub>1</sub>, ŝ<sub>2</sub>), followed by the adversary's delayed actions (in ŝ<sub>3</sub>, ŝ<sub>4</sub>). Note that, in the DAG model, at the time the UAV executed its actions (ŝ<sub>0</sub>, ŝ<sub>1</sub>, ŝ<sub>2</sub>) the adversary actions had not occurred (yet). Moreover, ŝ<sub>0</sub> and ŝ<sub>6</sub> (Fig. 5) share the same belief and true values as s<sub>0</sub> and s<sub>6</sub> (Fig. 3), respectively, though the transient states do not exactly match. This will be used to show the relationship between the games.

**Fig. 5.** The same scenario as in Fig. 3, modeled as a DAG. Solid squares represent UAV belief, while solid diamonds represent the ground truth. The UAV action GT denotes performing a geolocation task.

The advantage of this approach is twofold. First, the elimination of private variables enables simulation of an HIG using a full-information game. Thus, the formulation of the strategy synthesis problem using off-the-shelf SMG-based tools becomes feasible. In particular, a synthesized PL<sub>II</sub> strategy becomes dependent on the knowledge of PL<sub>I</sub> behavior (possible actions), rather than on the specific (hidden) actions. We formalize a DAG as follows.

**Definition 3 (Delayed-Action Game).** *A* DAG *of an HIG* G<sub>H</sub> = ⟨S, (S<sub>I</sub>, S<sub>II</sub>, S<sub>◦</sub>), A, s<sub>0</sub>, β, δ⟩*, with players* Γ = {I, II, ◦} *over a set of variables* V = {v<sub>T</sub>, v<sub>B</sub>}*, is a tuple* G<sub>D</sub> = ⟨Ŝ, (Ŝ<sub>I</sub>, Ŝ<sub>II</sub>, Ŝ<sub>◦</sub>), A, ŝ<sub>0</sub>, β, δ̂⟩ *where*


Note that, in contrast to the transition function δ in the HIG G<sub>H</sub>, δ̂ in the DAG G<sub>D</sub> only allows transitions from ŝ<sub>II</sub> to ŝ<sub>II</sub> or ŝ<sub>I</sub>, from ŝ<sub>I</sub> to ŝ<sub>I</sub> or ŝ<sub>◦</sub>, and probabilistic transitions from ŝ<sub>◦</sub> to ŝ<sub>II</sub>; also note that the transition from ŝ<sub>II</sub> to ŝ<sub>I</sub> is conditioned on the action θ.

**DAG Semantics.** A DAG state is a tuple ŝ = ⟨t̂, û, w, j, γ⟩, which for simplicity we shorten to ŝ<sub>γ</sub>⟨t̂, û, w, j⟩, where t̂ ∈ *Ev*(v<sub>T</sub>) is the last known true value, û ∈ *Ev*(v<sub>B</sub>) is PL<sub>II</sub>'s belief, w ∈ A<sub>II</sub><sup>∗</sup> captures the PL<sub>II</sub> actions taken since the last known true value, j ∈ ℕ<sub>0</sub> is an index on w, and γ ∈ Γ is the current player index. The game transitions are defined using the semantic rules from Fig. 6. Note that PL<sub>II</sub> can execute multiple moves (i.e., actions) before executing θ to attempt to reveal the true value (D2), moving to a PL<sub>I</sub> state where PL<sub>I</sub> executes all her delayed actions before reaching a 'revealing' state ŝ<sub>◦</sub> (D3). Finally, the revealing attempt can succeed with probability p<sub>i</sub>, in which case PL<sub>II</sub>'s belief is updated (i.e., û′ = t̂), or otherwise remains unchanged (D4).
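The state tuple and rule D4 can be sketched in Python (a minimal illustration with names of our choosing; the paper defines the semantics only via the rules in Fig. 6):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class DAGState:
    """A DAG state s_gamma<t, u, w, j> (field names are ours)."""
    t: Tuple[int, int]       # t-hat: last known true value (a grid cell here)
    u: Tuple[int, int]       # u-hat: PL_II's belief
    w: Tuple[str, ...]       # PL_II actions since the last known true value
    j: int                   # index into w
    player: str              # "I", "II", or "o" (the probabilistic player)

def reveal_outcomes(s: DAGState) -> Tuple[DAGState, DAGState]:
    """Rule D4, sketched: from a revealing state (player 'o'), the attempt
    succeeds with some probability p, updating the belief to the true value
    (u' = t); otherwise the belief is unchanged. Resetting w on failure is
    our simplifying assumption, not fixed by the excerpt."""
    success = DAGState(s.t, s.t, (), 0, "II")   # u' = t, memory reset
    failure = DAGState(s.t, s.u, (), 0, "II")   # belief unchanged
    return success, failure
```

The success branch is taken with probability p, the failure branch with 1 − p.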

In both G<sub>H</sub> and G<sub>D</sub>, we label states where all players have full knowledge of the current state as *proper*. We also say that two states are *similar* if they agree on the belief, and *equivalent* if they agree on both the belief and the ground truth.

**Definition 4 (States).** *Let* s<sub>γ</sub>(t, u, Ω) ∈ S *and* ŝ<sub>γ̂</sub>(t̂, û, w, j) ∈ Ŝ*. We say:*


From the above definition, we have that s ≃ ŝ =⇒ s ∈ Prop(G<sub>H</sub>), ŝ ∈ Prop(G<sub>D</sub>). We now define *execution fragments*, possible progressions from one state to another.

**Definition 5 (Execution Fragment).** *An* execution fragment *(of either an SMG, DAG or HIG) is a finite sequence of states, actions and probabilities*

ϱ = s<sub>0</sub>a<sub>1</sub>p<sub>1</sub>s<sub>1</sub>a<sub>2</sub>p<sub>2</sub>s<sub>2</sub> ... a<sub>n</sub>p<sub>n</sub>s<sub>n</sub> *such that* (s<sub>i</sub> →<sup>a<sub>i+1</sub></sup> s<sub>i+1</sub>) ∨ (s<sub>i</sub> →<sup>p<sub>i+1</sub></sup> s<sub>i+1</sub>), ∀i ≥ 0*.*<sup>3</sup>

We use *first*(ϱ) and *last*(ϱ) to refer to the first and last states of ϱ, respectively. If both states are proper, we say that ϱ is *proper* as well, denoted by ϱ ∈ Prop(G<sub>H</sub>).<sup>4</sup> Moreover, ϱ is *deterministic* if no probabilities appear in the sequence.

**Definition 6 (Move).** *A* move m<sub>γ</sub> *of an execution* ϱ *from state* s ∈ ϱ*, denoted by* move<sub>γ</sub>(s, ϱ)*, is a sequence of actions* a<sub>1</sub>a<sub>2</sub> ... a<sub>i</sub> ∈ A<sub>γ</sub><sup>∗</sup> *that player* γ *performs in* ϱ *starting from* s*.*

By omitting the player index we refer to the moves of all players. To simplify notation, we use *move*(ϱ) as a short notation for *move*(*first*(ϱ), ϱ). We write (m)(*first*(ϱ)) = *last*(ϱ) to denote that the execution of move m from *first*(ϱ) leads to *last*(ϱ). This allows us to now define the *delay operator* as follows.

<sup>3</sup> For deterministic transitions, p = 1, and hence is omitted from ϱ for readability.

<sup>4</sup> An execution fragment lives in the transition system (TS), i.e., ϱ ∈ Prop(TS(G)). We omit TS for readability.

**Definition 7 (Delay Operator).** *For an HIG* G<sub>H</sub>*, let* m = *move*(ϱ) = a<sub>1</sub>b<sub>1</sub> ... a<sub>n</sub>b<sub>n</sub>θ *be a move for some deterministic* ϱ ∈ TS(G<sub>H</sub>)*, where* a<sub>1</sub> ... a<sub>n</sub> ∈ A<sub>II</sub><sup>∗</sup>, b<sub>1</sub> ... b<sub>n</sub> ∈ A<sub>I</sub><sup>∗</sup>*. The delay operator, denoted by* m̄*, is defined by the rule* m̄ = a<sub>1</sub> ... a<sub>n</sub>θb<sub>1</sub> ... b<sub>n</sub>*.*

Intuitively, the delay operator shifts PL<sub>I</sub> actions to the right of PL<sub>II</sub> actions, up until the next probabilistic state. For example,

if ϱ = s<sub>II</sub><sup>(0)</sup> a<sub>1</sub> s<sub>I</sub><sup>(1)</sup> b<sub>2</sub> s<sub>II</sub><sup>(2)</sup> θ s<sup>(3)</sup> p<sub>3</sub> s<sub>II</sub><sup>(4)</sup> a<sub>4</sub> s<sub>I</sub><sup>(5)</sup> b<sub>5</sub> s<sub>II</sub><sup>(6)</sup> a<sub>6</sub> s<sub>I</sub><sup>(7)</sup> b<sub>7</sub> s<sub>II</sub><sup>(8)</sup>, then m = a<sub>1</sub> b<sub>2</sub> θ τ a<sub>4</sub> b<sub>5</sub> a<sub>6</sub> b<sub>7</sub>, and m̄ = a<sub>1</sub> θ b<sub>2</sub> τ a<sub>4</sub> a<sub>6</sub> b<sub>5</sub> b<sub>7</sub>.
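The delay operator on a move like the one above can be implemented directly. This sketch (our encoding, with `("theta",)` and `("tau",)` as markers) shifts PL<sub>I</sub> actions rightward within each segment delimited by a probabilistic transition:

```python
def delay(move):
    """Sketch of the delay operator: within each segment ending at a
    probabilistic transition ("tau",), PL_I actions are shifted to the right
    of PL_II actions and of the revealing action ("theta",).
    Player actions are pairs (player, name) with player in {"I", "II"}."""
    out, pending = [], []              # pending holds delayed PL_I actions
    for a in move:
        if a == ("tau",):              # probabilistic state: flush delays
            out.extend(pending)
            pending = []
            out.append(a)
        elif a[0] == "I":
            pending.append(a)          # delay this PL_I action
        else:                          # PL_II action or theta: keep in place
            out.append(a)
    out.extend(pending)                # flush any trailing delayed actions
    return out
```

On the example move, `delay` reproduces m̄ = a<sub>1</sub> θ b<sub>2</sub> τ a<sub>4</sub> a<sub>6</sub> b<sub>5</sub> b<sub>7</sub>.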

**Simulation Relation.** Given an HIG G<sub>H</sub>, we first define the corresponding DAG G<sub>D</sub>.

**Definition 8 (Correspondence).** *Given an HIG* G<sub>H</sub>*, a corresponding DAG* G<sub>D</sub> = D[G<sub>H</sub>] *is a DAG that follows the semantic rules displayed in Fig. 7.*

**Fig. 7.** Semantic rules for HIG-to-DAG transformation.

For the rest of this section, we consider G<sub>D</sub> = D[G<sub>H</sub>], and use ϱ ∈ TS(G<sub>H</sub>) and ϱ̂ ∈ TS(G<sub>D</sub>) to denote two execution fragments of the HIG and the DAG, respectively. We say that ϱ and ϱ̂ are *similar*, denoted by ϱ ∼ ϱ̂, iff *first*(ϱ) ≃ *first*(ϱ̂), *last*(ϱ) ∼ *last*(ϱ̂), and *move*(ϱ) = *move*(ϱ̂).

**Definition 9 (Game Proper Simulation).** *A game* G<sub>D</sub> *properly simulates* G<sub>H</sub>*, denoted by* G<sub>D</sub> ≼ G<sub>H</sub>*, iff* ∀ϱ ∈ Prop(G<sub>H</sub>)*,* ∃ϱ̂ ∈ Prop(G<sub>D</sub>) *such that* ϱ ∼ ϱ̂*.*

Before proving the existence of the simulation relation, we first show that if a move is executed on two equivalent states, then the terminal states are similar.

**Lemma 1 (Terminal States Similarity).** *For any* s<sub>0</sub> ≃ ŝ<sub>0</sub> *and deterministic* ϱ ∈ TS(G<sub>H</sub>) *where first*(ϱ) = s<sub>0</sub> *and last*(ϱ) ∈ S<sub>II</sub>*, it holds that last*(ϱ) ∼ m̄(ŝ<sub>0</sub>)*, where* m = *move*(ϱ)*.*

*Proof.* Let *last*(ϱ<sub>i</sub>) = s<sub>γ<sub>i</sub></sub><sup>(i)</sup>(t<sub>i</sub>, u<sub>i</sub>, Ω<sub>i</sub>) and m̄<sub>i</sub>(ŝ<sub>0</sub>) = ŝ<sub>γ̂<sub>i</sub></sub><sup>(i)</sup>(t̂<sub>i</sub>, û<sub>i</sub>, w<sub>i</sub>, j<sub>i</sub>), where m<sub>i</sub> = *move*(ϱ<sub>i</sub>) = a<sub>1</sub>b<sub>1</sub> ... a<sub>i</sub>b<sub>i</sub>θ and m̄<sub>i</sub> = a<sub>1</sub> ... a<sub>i</sub>θb<sub>1</sub> ... b<sub>i</sub>. We use induction over i as follows:


**Theorem 1 (Probabilistic Simulation).** *For any* s<sub>0</sub> ≃ ŝ<sub>0</sub> *and* ϱ ∈ Prop(G<sub>H</sub>) *where first*(ϱ) = s<sub>0</sub>*, it holds that*

$$\Pr\left[last(\varrho) = s'\right] = \Pr\left[\left(\overline{move(\varrho)}\right)(\hat{s}\_0) = \hat{s}'\right] \quad \forall s', \hat{s}' \quad s.t. \quad s' \simeq \hat{s}'.$$

*Proof.* We can rewrite ϱ as ϱ = ϱ<sub>0</sub> p<sub>1</sub> ϱ<sub>1</sub> ··· ϱ<sub>n−1</sub> p<sub>n</sub> s<sub>II</sub><sup>(n)</sup>, where ϱ<sub>0</sub>, ϱ<sub>1</sub>, ..., ϱ<sub>n−1</sub> are deterministic. Let *first*(ϱ<sub>i</sub>) = s<sub>II</sub><sup>(i)</sup>(t<sub>i</sub>, u<sub>i</sub>, Ω<sub>i</sub>), *last*(ϱ<sub>i</sub>) = s<sub>◦</sub><sup>(i)</sup>(t′<sub>i</sub>, u′<sub>i</sub>, Ω′<sub>i</sub>), and m̄(ŝ<sub>0</sub>) = ŝ<sup>(n)</sup>(t̂<sub>n</sub>, û<sub>n</sub>, w<sub>n</sub>, j<sub>n</sub>) for m = *move*(ϱ). We use induction over n as follows:


Note that in case of multiple θ attempts, the above probability P satisfies

$$P = \prod\_{i=1}^{n} \sum\_{j=1}^{m\_i} p\_i \left( t'\_{i-1}, u'\_{i-1} \right) \left( 1 - p\_i \left( t'\_{i-1}, u'\_{i-1} \right) \right)^{j-1},$$

where m<sub>i</sub> is the number of θ attempts at stage i. Finally, since Theorem 1 imposes no constraints on *move*(ϱ), a DAG can simulate all proper executions that exist in the corresponding HIG.
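As a sanity check on the repeated-attempt term, each inner sum is a truncated geometric series: the probability that at least one of m attempts, each succeeding independently with probability p, succeeds. A small sketch:

```python
def success_within(p: float, m: int) -> float:
    """Probability that at least one of m revealing attempts succeeds, when
    each attempt independently succeeds with probability p:
    sum_{j=1}^{m} p * (1 - p)**(j - 1), which telescopes to 1 - (1 - p)**m."""
    return sum(p * (1.0 - p) ** (j - 1) for j in range(1, m + 1))
```

For p = 0.3 and m = 4 this evaluates to 1 − 0.7⁴, matching the closed form.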

**Theorem 2 (DAG-HIG Simulation).** *For any HIG* G<sub>H</sub>*, there exists a DAG* G<sub>D</sub> = D[G<sub>H</sub>] *such that* G<sub>D</sub> ≼ G<sub>H</sub> *(as defined in Definition 9).*

#### **4 Properties of DAG and DAG-based Synthesis**

We now discuss DAG features, including how a DAG can be decomposed into subgames by restricting the simulation to finite executions, and how safety properties are preserved, before proposing a DAG-based synthesis framework.

**Transitions.** In DAGs, the nondeterministic actions of different players carry different semantics. Specifically, PL<sub>I</sub> nondeterminism captures what is known about the adversarial behavior, rather than exact actions, where PL<sub>I</sub> actions are constrained by the earlier PL<sub>II</sub> action. Conversely, PL<sub>II</sub> nondeterminism abstracts the player's decisions. This distinction reflects how DAGs can be used for strategy synthesis under hidden information. To illustrate, suppose that a strategy π<sub>II</sub> is to be obtained based on a worst-case scenario. In that case, the game is explored for all possible adversarial behaviors. Yet, if a strategy π<sub>I</sub> is known for PL<sub>I</sub>, a counter-strategy π<sub>II</sub> can be found by constructing G<sub>D</sub><sup>π<sub>I</sub></sup>.

Probabilistic behaviors in DAGs are captured by PL<sub>◦</sub>, which is characterized by the transition function δ̂ : Ŝ<sub>◦</sub> × Ŝ<sub>II</sub> → [0, 1]. The specific definition of δ̂ depends on the modeled system. For instance, if the transition function (i.e., the probability) is state-independent, i.e., δ̂(ŝ<sub>◦</sub>, ŝ<sub>II</sub>) = c, c ∈ [0, 1], the obtained model becomes trivial. Yet, with a state-dependent transition function, i.e., δ̂(ŝ<sub>◦</sub>, ŝ<sub>II</sub>) = p(t̂, û), the probability that PL<sub>II</sub> successfully reveals the true value depends on both the belief and the true value, and the transition function can then be realized since ŝ holds both t̂ and û.

**Decomposition.** Consider an execution ϱ̂<sup>∗</sup> = ŝ<sub>0</sub>a<sub>1</sub>ŝ<sub>1</sub>a<sub>2</sub>ŝ<sub>2</sub> ... that describes a scenario where PL<sub>II</sub> performs infinitely many actions with no attempt to reveal the true value. To simulate ϱ̂<sup>∗</sup>, the word w needs to grow infinitely. Since we are interested in finite executions, we impose *stopping criteria* on the DAG, such that the game is *trapped* whenever |w| = h<sub>max</sub> holds, where h<sub>max</sub> ∈ ℕ is an *upper horizon*. We formalize the stopping criteria as a deterministic finite automaton (DFA) that, when composed with the DAG, traps the game whenever the stopping criteria hold. Note that imposing an upper horizon by itself is not a sufficient criterion for a DAG to be considered a stopping game [8]. Conversely, consider a proper (and hence finite) execution ϱ̂ = ŝ<sub>0</sub>a<sub>1</sub> ... ŝ′, where ŝ<sub>0</sub>, ŝ′ ∈ Prop(G<sub>D</sub>). From Definition 9, it follows that a DAG initial state is strictly proper, i.e., ŝ<sub>0</sub> ∈ Prop(G<sub>D</sub>). Hence, when ŝ′ is reached, the game can be seen as if it is *repeated* with a new initial state ŝ′. Consequently, a DAG (complemented with stopping criteria) can be decomposed into a countable (possibly infinite) set of *subgames* that have the same structure yet different initial states.
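The stopping criterion can be sketched as a simple counter DFA (our minimal rendering; the paper composes an actual DFA with the game):

```python
class StoppingDFA:
    """Sketch of the stopping criterion as a DFA: it counts the PL_II actions
    appended to the memory word w and moves to an absorbing 'trap' state once
    |w| reaches the upper horizon h_max. Names are ours."""
    def __init__(self, h_max: int):
        self.h_max = h_max
        self.count = 0          # current |w|
        self.trapped = False

    def step(self, action: str) -> None:
        if self.trapped:
            return              # absorbing: the game stays trapped
        if action == "write":   # one more PL_II action stored in w
            self.count += 1
            if self.count >= self.h_max:
                self.trapped = True
```

Composing this automaton with the game forces every subgame to terminate within h<sub>max</sub> PL<sub>II</sub> actions.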

**Definition 10 (DAG Subgames).** *The* subgames *of a* G<sub>D</sub> *are defined by the set* {Ĝ<sub>i</sub> | Ĝ<sub>i</sub> = ⟨Ŝ<sup>(i)</sup>, (Ŝ<sub>I</sub><sup>(i)</sup>, Ŝ<sub>II</sub><sup>(i)</sup>, Ŝ<sub>◦</sub><sup>(i)</sup>), A, ŝ<sub>0</sub><sup>(i)</sup>, δ̂<sup>(i)</sup>⟩, i ∈ ℕ<sub>0</sub>}*, where* Ŝ = ⋃<sub>i</sub> Ŝ<sup>(i)</sup>*;* Ŝ<sub>γ</sub> = ⋃<sub>i</sub> Ŝ<sub>γ</sub><sup>(i)</sup> ∀γ ∈ Γ*; and* ŝ<sub>0</sub><sup>(i)</sup> = ŝ<sub>II</sub><sup>(i)</sup> *s.t.* ŝ<sub>II</sub><sup>(i)</sup> ∈ Prop(G<sub>D</sub><sup>(i)</sup>)*,* ŝ<sub>II</sub><sup>(i)</sup> ≠ ŝ<sub>II</sub><sup>(j)</sup> ∀i, j ∈ ℕ<sub>0</sub>, i ≠ j*.*

Intuitively, each subgame either reaches a proper state (representing the initial state of another subgame) or terminates by an upper horizon. This decomposition allows for the independent (and parallel) analysis of individual subgames, drastically reducing both the time required for synthesis and the explored state space, and hence improving scalability. An example of this decompositional approach is provided in Sect. 5.

**Preservation of Safety Properties.** In DAGs, the action θ denotes a transition from PL<sub>II</sub> to PL<sub>I</sub> states and thus the execution of any delayed actions. While this action can simply describe a revealing attempt, it can also serve as a *what-if* analysis of how the true value may evolve at stage i of a subgame. We refer to an execution of the second type as a *hypothetical branch*, where Hyp(ϱ̂, h) denotes the set of hypothetical branches from ϱ̂ at stage h ∈ {1, ..., n}. Let L<sub>safe</sub>(s) be a labeling function denoting whether a state is safe. The formula Φ<sub>safe</sub> := [G *safe*] is satisfied by an execution ϱ in the HIG iff all s(t, u, Ω) ∈ ϱ are safe.

Now, consider an execution ϱ̂ of the DAG with ϱ̂ ∼ ϱ. We identify the following three cases:


Taking such relations into account, both safety requirements (e.g., never encounter a hazard) and distance-based requirements (e.g., never exceed a subgame horizon) can be specified when using DAGs for synthesis, to ensure their satisfaction in the original model. This can be generalized to other reward-based synthesis objectives, which will be part of our future efforts discussed in Sect. 6.

**Synthesis Framework.** We now propose a framework for strategy synthesis using DAGs, summarized in Fig. 8. We start by formulating the automata M<sub>I</sub>, M<sub>II</sub> and M<sub>◦</sub>, representing the abstract behaviors of PL<sub>I</sub>, PL<sub>II</sub> and PL<sub>◦</sub>, respectively. Next, a FIFO memory stack (m<sub>i</sub>)<sub>i=1</sub><sup>n</sup> ∈ A<sub>II</sub><sup>n</sup> is implemented using two automata M<sub>mrd</sub> and M<sub>mwr</sub> that perform reading and writing operations, respectively.<sup>5</sup> The DAG G<sub>D</sub> is constructed by following Algorithm 1. The game starts with PL<sub>II</sub> moves until she executes a revealing attempt θ, allowing PL<sub>I</sub> to play her delayed actions. Once an end criterion is met, the game terminates, resembling conditions such as 'running out of fuel' or 'reaching map boundaries'.

**Fig. 8.** Synthesis and analysis framework based on the use of DAGs.

<sup>5</sup> Specific implementation details are described in Sect. 5.


Algorithm 2 describes the procedure for strategy synthesis based on the DAG G<sub>D</sub> and an rPATL [6] synthesis query φ<sub>syn</sub> that captures, for example, a safety requirement. Starting with the initial location, the procedure checks whether φ<sub>syn</sub> is satisfied if action θ is performed at stage h, and updates the set of feasible strategies Π<sub>i</sub> for subgame Ĝ<sub>i</sub> until h<sub>max</sub> is reached or φ<sub>syn</sub> is no longer satisfied.<sup>6</sup> Next, the set Π<sub>i</sub> is used to update the list of reachable end locations with new initial locations of reachable subgames that should be explored. Finally, the composition of G<sub>H</sub> and Π<sub>II</sub><sup>∗</sup> resolves PL<sub>II</sub> nondeterminism, where the resulting model G<sub>H</sub><sup>Π<sub>II</sub><sup>∗</sup></sup> is a Markov decision process (MDP) of complete information that can be easily used for further analysis.
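The overall loop of Algorithm 2 can be paraphrased as follows, with `check(loc, h)` standing in for an external model-checker call (e.g., a PRISM-games query); all names are ours:

```python
from collections import deque

def synthesize(initial_loc, h_max, check):
    """Paraphrase of the subgame exploration loop. `check(loc, h)` returns
    (strategy, reachable_end_locations) if the synthesis query holds when
    theta is played at stage h of the subgame starting at `loc`, else None.
    Per footnote 6, failure at stage h implies failure for all h' > h."""
    strategies = {}                       # subgame initial location -> strategy
    frontier, seen = deque([initial_loc]), {initial_loc}
    while frontier:
        loc = frontier.popleft()
        for h in range(1, h_max + 1):
            result = check(loc, h)
            if result is None:            # no strategy at h => stop this subgame
                break
            strategy, end_locs = result
            strategies[loc] = strategy    # keep the latest feasible strategy
            for nxt in end_locs:          # new reachable subgames to explore
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return strategies
```

The returned map, one strategy per reachable subgame, is what induces the complete-information MDP used for further analysis.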

#### **5 Case Study**

In this section, we consider a case study where a human operator supervises a UAV prone to stealthy attacks on its GPS sensor. The UAV mission is to visit a number of targets after being airborne from a known base (initial state), while avoiding hazard zones that are known a priori. Moreover, the presence of adversarial stealthy attacks via GPS spoofing is assumed. We use the DAG framework to synthesize strategies for both the UAV and an operator advisory system (AS) that schedules geolocation tasks for the operator.

**Modeling.** We model the system as a delayed-action game G<sub>D</sub>, where PL<sub>I</sub> and PL<sub>II</sub> represent the adversary and the UAV-AS coalition, respectively. Figure 9 shows the model's primary and auxiliary components. In the UAV model M<sub>uav</sub>, x<sub>B</sub> = (x<sub>B</sub>, y<sub>B</sub>) encodes the UAV belief, and A<sub>uav</sub> = {N, S, E, W, NE, NW, SE, SW} is the set of available movements. The AS can trigger the action *activate* to initiate a geolocation task, attempting to confirm the current location. The adversary behavior is abstracted by M<sub>adv</sub>, where x<sub>T</sub> = (x<sub>T</sub>, y<sub>T</sub>) encodes the UAV true location. The adversarial actions are limited to one directional

<sup>6</sup> Failing to find a strategy at stage i implies the same for all horizons of size j>i.

**Fig. 9.** Primary DAG components: UAV (M<sub>uav</sub>), adversary (M<sub>adv</sub>), and AS (M<sub>as</sub>). Auxiliary DAG components: memory write (M<sub>mwr</sub>) and memory read (M<sub>mrd</sub>) models, capturing the DAG representation. At stage i, the next memory location to write/read is m<sub>i</sub>.

increment at most.<sup>7</sup> If, for example, the UAV is heading N, then the adversary's set of actions is β(N) = {N, NE, NW}. The auxiliary components M<sub>mwr</sub> and M<sub>mrd</sub> manage a FIFO memory stack (m<sub>i</sub>)<sub>i=0</sub><sup>n−1</sup> ∈ A<sub>uav</sub><sup>n</sup>. The last UAV movement is saved in m<sub>i</sub> by synchronizing M<sub>mwr</sub> with M<sub>uav</sub> via *write*, while M<sub>mrd</sub> synchronizes with M<sub>adv</sub> via *read* to read the next UAV action from m<sub>j</sub>. The subgame terminates whenever action *write* is attempted while M<sub>mwr</sub> is at state n (i.e., out of memory).

The goal is to find strategies for the UAV-AS coalition based on the following:

– *Target reachability.* To overcome cases where targets are unreachable due to hazard zones, the label *reach* is assigned to the set of states with acceptable checkpoint locations (including the target) to render the objective incrementally feasible. The objective for all encountered subgames is then formalized as Pr<sub>max</sub>[F *reach*] ⩾ p<sub>min</sub> for some bound p<sub>min</sub>.

<sup>7</sup> To detect aggressive attacks, techniques from the literature (e.g., [16,25,26]) can be used.

– *Hazard avoidance.* Similar to target reachability, the label *hazard* is assigned to states corresponding to hazard zones. The objective Pr<sub>max</sub>[G ¬*hazard*] ⩾ p<sub>min</sub> is then specified for all encountered subgames.

By refining the aforementioned objectives, synthesis queries are used for both the subgames and the supergame. Specifically, the query

$$\phi\_{\text{syn}}(k) \coloneqq \langle \langle \text{uav} \rangle \rangle \text{Pr}\_{\text{max}=?} \left[ \neg hazard \, \mathsf{U}^{\leqslant k} \, \left( locate \wedge reach \right) \right] \tag{1}$$

is specified for each encountered subgame Ĝ<sub>i</sub>, where *locate* indicates a successful geolocation task. By following Algorithm 2 for a number q of reachable subgames, the supergame is reduced to an MDP G<sub>D</sub><sup>{π<sub>i</sub>}<sub>i=1</sub><sup>q</sup></sup> (whose states are the reachable subgames), which is checked against the query

$$\phi\_{\text{ana}}(n) \coloneqq \langle \langle \text{adv} \rangle \rangle \text{Pr}\_{\text{min}, \text{max}=?} \left[ \mathsf{F}^{\leqslant n} \ target \right] \tag{2}$$

to find the bounds on the probability that the target is reached under a maximum number of geolocation tasks n.

**Experimental Results.** Figure 10(a) shows the map setting used for implementation. The UAV's ability to actively detect an attack depends on both its belief and the ground truth. Specifically, the probability of success in a geolocation task mainly relies on the disparity between the belief and the true location, captured by *f*<sub>dis</sub> : *Ev*(x<sub>B</sub>) × *Ev*(x<sub>T</sub>) → [0, 1], obtained by assigning probabilities to each pair of locations according to their features (e.g., landmarks) and smoothing them using a 2D Gaussian filter. A thorough experimental analysis where probabilities are extracted from experiments with human operators is described in [11]. The set of hazard zones includes the map boundaries to prevent the UAV from reaching boundary values. Also, the adversary is prohibited from launching attacks for at least the first step, a practical assumption that prevents the UAV model from infinitely bouncing around the target location.
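A minimal sketch of the smoothing step, assuming a fixed 3×3 Gaussian kernel (the paper does not specify the filter parameters):

```python
def gaussian_blur(grid):
    """Smooth a 2D table of per-cell detection probabilities with a fixed
    3x3 Gaussian kernel -- a stand-in for the 2D filter used to smooth the
    feature-based probabilities f_dis. Border cells renormalize the kernel."""
    k = [[1, 2, 1],
         [2, 4, 2],
         [1, 2, 1]]                     # integer Gaussian weights, sum 16
    n, m = len(grid), len(grid[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc, wsum = 0.0, 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n and 0 <= jj < m:
                        w = k[di + 1][dj + 1]
                        acc += w * grid[ii][jj]
                        wsum += w
            out[i][j] = acc / wsum      # normalized weighted average
    return out
```

A uniform probability map is left unchanged, and isolated high-probability cells (e.g., landmarks) are spread to their neighbors, as intended for f<sub>dis</sub>.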

We implemented the model in PRISM-games [7,19] and performed the experiments on an Intel Core i7 4.0 GHz CPU, with 10 GB RAM dedicated to the tool. Figure 10(b) shows the supergame obtained by following the procedure in Algorithm 2. A vertex Ĝ<sub>xy</sub> represents a subgame (composed with its strategy) that starts at location (x, y), while the outgoing edges point to subgames reachable from the current one. Note that each edge represents a probabilistic transition. Subgames with more than one outgoing transition imply nondeterminism that is resolved by the adversary actions. Hence, the directed graph depicts an MDP.

The synthesized strategy for (h<sub>adv</sub> = 2, h = 4) is demonstrated in Fig. 10(c). For the initial subgame, Fig. 11(a) shows the maximum probability of a successful geolocation task if performed at stage h, and the remaining distance to the target. Assuming the adversary can launch attacks after stage h<sub>adv</sub> = 2, the detection probability is maximized by performing the geolocation task at step 4,

**Fig. 10.** (a) The environment setup used for the case study; (b) the induced supergame MDP, where the subgames form its states; and (c) the synthesized protocols.

and hazard areas can still be avoided up to h = 6. For h<sub>adv</sub> = 1, however, h = 3 has the highest probability of success, which diminishes at h = 6 as no possible flight plan exists without encountering a hazard zone. The effect of the maximum number of geolocation tasks (n) on target reachability is studied by analyzing the supergame against φ<sub>ana</sub>, as shown in Fig. 11(b). The minimum number of geolocation tasks that guarantees a non-zero probability of reaching the target (regardless of the adversary strategy) is 3, with probability bounds of (33.7%, 94.4%).

**Fig. 11.** Analysis results for (a) subgame Ĝ<sub>51</sub> and (b) supergame G<sub>D</sub>.

The experimental data obtained for this case study are listed in Table 1. For the same grid size, more complex maps require more time for synthesis, while the state-space size remains unaffected. The state space grows exponentially with the explored horizon size, i.e., O((|A<sub>uav</sub>||A<sub>adv</sub>|)<sup>h</sup>), and this growth is typically slowed by, e.g., the presence of hazard areas, since the branches of the game transitions are trimmed upon encountering such areas. Interestingly, for h = 6 and h = 7, while the model construction time (size) for h<sub>adv</sub> = 1 is almost twice (quadruple, respectively) that for h<sub>adv</sub> = 2, the time for checking φ<sub>syn</sub> declines in comparison. This reflects the fact that, in the case of h<sub>adv</sub> = 1 compared to h<sub>adv</sub> = 2, the UAV has higher chances of reaching a hazard zone for the same k, leading to a shorter model-checking time.


**Table 1.** Results for strategy synthesis using queries φsyn and φana.

#### **6 Discussion and Conclusion**

In this paper, we introduced DAGs and showed how they can simulate HIGs by delaying players' actions. We also derived a DAG-based framework for strategy synthesis and analysis using off-the-shelf SMG model checkers. Under some practical assumptions, we showed that DAGs can be decomposed into independent subgames, utilizing parallel computation to reduce the time needed for model analysis, as well as the size of the state space. We further demonstrated the applicability of the proposed framework on a case study focused on synthesis and analysis of active attack detection strategies for UAVs prone to cyber attacks.

DAGs come at the cost of increasing the total state-space size, as M<sub>mrd</sub> and M<sub>mwr</sub> are introduced. This does not present a significant limitation thanks to the compositional approach to strategy synthesis using subgames. However, synthesis is still limited to model sizes that off-the-shelf tools can handle.

The concept of delaying actions implicitly assumes that the adversary knows the UAV actions a priori. This does not present a concern in the presented case study as an abstract (i.e., nondeterministic) adversary model is analogous to synthesizing against the worst-case attacking scenario. Nevertheless, strategies synthesized using DAGs (and SMGs in general) are inherently conservative. Depending on the considered system, this can easily lead to no feasible solution.

The proposed synthesis framework ensures the preservation of safety properties. Yet, general reward-based strategy synthesis should be approached with care. For example, rewards that depend on the belief can appear in any state, and exploring hypothetical branches is not required. However, rewards that depend on a state's true value should only appear in proper states, and all hypothetical branches are to be explored. A detailed investigation of how various properties are preserved by DAGs, along with multi-objective synthesis, is a direction for future work.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Automated Hypersafety Verification**

Azadeh Farzan(B) and Anthony Vandikas

University of Toronto, Toronto, Canada azadeh@cs.toronto.edu

**Abstract.** We propose an automated verification technique for hypersafety properties, which express sets of valid interrelations between multiple finite runs of a program. The key observation is that constructing a proof for a small representative set of the runs of the product program (i.e. the product of the several copies of the program by itself), called a *reduction*, is sufficient to formally prove the hypersafety property about the program. We propose an algorithm based on a counterexampleguided refinement loop that simultaneously searches for a reduction and a proof of the correctness for the reduction. We demonstrate that our tool Weaver is very effective in verifying a diverse array of hypersafety properties for a diverse class of input programs.

#### **1 Introduction**

A hypersafety property describes the set of valid interrelations between multiple finite runs of a program. A k-safety property [7] is a program safety property whose violation is witnessed by at least k finite runs of a program. Determinism is an example of such a property: non-determinism can only be witnessed by two runs of the program on the same input which produce two different outputs. This makes determinism an instance of a 2-safety property.

The vast majority of existing program verification methodologies are geared towards verifying standard (1-)safety properties. This paper proposes an approach to automatically reduce verification of k-safety to verification of 1-safety, and hence a way to leverage existing safety verification techniques for hypersafety verification. The most straightforward way to do this is via *self-composition* [5], where verification is performed on k memory-disjoint copies of the program, sequentially composed one after another. Unfortunately, the proofs in these cases are often very verbose, since the full functionality of each copy has to be captured by the proof. Moreover, when it comes to automated verification, the invariants required to verify such programs are often well beyond the capabilities of modern solvers [26] even for very simple programs and properties.
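For concreteness, self-composition of a toy program for the 2-safety property of determinism looks as follows (our illustration, not the paper's encoding):

```python
def program(x: int) -> int:
    """Toy program whose determinism we want to check (illustrative only)."""
    y = 0
    for _ in range(x):
        y += 2
    return y

def self_composed(x: int) -> int:
    """Sequential self-composition: two memory-disjoint copies of `program`
    run one after the other on the same input. Determinism, a 2-safety
    property, becomes the 1-safety assertion that the two outputs agree."""
    y1 = program(x)   # copy 1 (variables implicitly renamed)
    y2 = program(x)   # copy 2
    assert y1 == y2, "2-safety violation: non-determinism witnessed"
    return y1
```

A standard (1-)safety verifier can now check the assertion in `self_composed`, though, as noted above, the invariants it needs must describe both copies at once.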

The more practical approach, which is typically used in manual or automated proofs of such properties, is to compose k memory-disjoint copies of the program *in parallel* (instead of in sequence), and then verify some *reduced* program obtained by removing redundant traces from the program formed in the previous step. This parallel product program can have many such reductions. For example, the program formed from sequential self-composition is one such reduction of the parallel product program. Therefore, care must be taken to choose a "good" reduction that *admits a simple proof*. Many existing approaches limit themselves to a narrow class of reductions, such as the one where each copy of the program executes in lockstep [3,10,24], or define a general class of reductions, but do not provide algorithms with guarantees of covering the entire class [4,24].

We propose a solution that combines the search for a safety proof with the search for an appropriate reduction, in a counterexample-based refinement loop. Instead of settling on a single reduction in advance, we try to verify the entire (possibly infinite) set of reductions simultaneously and terminate as soon as some reduction is successfully verified. If the proof is not currently strong enough to cover at least one of the represented program reductions, then an appropriate set of counterexamples are generated that guarantee progress towards a proof.

Our solution is language-theoretic. We propose a way to represent sets of reductions using infinite tree automata. The standard safety proofs are also represented using the same automata, which have the desired closure properties. This allows us to check if a candidate proof is in fact a proof for one of the represented program reductions, with reasonable efficiency.

Our approach is not uniquely applicable to hypersafety properties of sequential programs. Our proposed set of reductions naturally work well for concurrent programs, and can be viewed in the spirit of reduction-based methods such as those proposed in [11,21]. This makes our approach particularly appealing when it comes to verification of hypersafety properties of concurrent programs, for example, proving that a concurrent program is deterministic. The parallel composition for hypersafety verification mentioned above and the parallel composition of threads inside the multi-threaded program are treated in a uniform way by our proof construction and checking algorithms. In summary:


#### **2 Illustrative Example**

We use a simple program Mult, which computes the product of two non-negative integers, to illustrate the challenges of verifying hypersafety properties and the type of proof that our approach targets. Consider the multiplication program in Fig. 1(i), and assume we want to prove that it is distributive over addition.


**Fig. 1.** Program Mult (i) and the parallel composition of three copies of it (ii).

In Fig. 1(ii), the parallel composition of Mult with two copies of itself is illustrated. The product program is formed for the purpose of proving distributivity, which can be encoded through the postcondition x1 = x2 + x3. Since a, b, and c are not modified in the program, the same variables are used across all copies. One way to prove Mult is distributive is to come up with an inductive invariant φijk for each location in the product program, represented by a triple of program locations (i, j, k), such that *true* =⇒ φ111 and φ666 =⇒ x1 = x2 + x3. The main difficulty lies in finding assignments for locations such as φ611, which are points in the execution of the program where one thread has finished executing and the next one is starting. For example, at (6, 1, 1) we need the assignment φ611 ← x1 = (a + b) ∗ c, which is non-linear. However, the program given in Fig. 1(ii) can be verified with simpler (linear) reasoning.

The full composition of Fig. 1(ii) admits a semantically equivalent *reduction*, shown below. Consider the program P = (Copy 1 || (Copy 2; Copy 3)). The reduced program is equivalent to a lockstep execution of the two parallel components of P:

    i1 ← 0; i2 ← 0; i3 ← 0
    x1 ← 0; x2 ← 0; x3 ← 0
    while i2 < a:
        x1 ← x1 + c;  x2 ← x2 + c
        i1 ← i1 + 1;  i2 ← i2 + 1
    while i3 < b:
        x1 ← x1 + c;  x3 ← x3 + c
        i1 ← i1 + 1;  i3 ← i3 + 1

The validity of this reduction is derived from the fact that the statements in each thread are *independent* of the statements in the others: reordering the statements of different threads in an execution leads to an equivalent execution. It is easy to see that x1 = x2 + x3 is an invariant of both while loops in the reduced program, and therefore linear reasoning is sufficient to prove the postcondition for this program. Conceptually, this reduction (and its soundness proof) together with the proof of correctness for the reduced program constitute a proof that the original program Mult is distributive. Our proposed approach can come up with reductions like this and their corresponding proofs fully automatically. Note that a lockstep reduction of the program in Fig. 1(ii) would not yield a solution for this problem, and therefore the discovery of the right reduction is an integral part of the solution.
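As a quick sanity check (ours, not part of the paper's toolchain), the reduced program from the example can be executed directly, asserting its linear invariant x1 = x2 + x3 at every loop head; the function name is ours.

```python
def reduced_mult(a, b, c):
    """Simulate the reduced product program from Sect. 2."""
    i1 = i2 = i3 = 0
    x1 = x2 = x3 = 0
    while i2 < a:
        assert x1 == x2 + x3          # linear invariant of the first loop
        x1 += c; x2 += c
        i1 += 1; i2 += 1
    while i3 < b:
        assert x1 == x2 + x3          # same linear invariant, second loop
        x1 += c; x3 += c
        i1 += 1; i3 += 1
    assert x1 == x2 + x3              # postcondition: (a+b)*c == a*c + b*c
    return x1, x2, x3

print(reduced_mult(3, 4, 5))  # → (35, 15, 20)
```

Running the simulation for any non-negative a, b, c never trips an assertion, which is exactly what the linear proof establishes once and for all.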

#### **3 Programs and Proofs**

A non-deterministic finite automaton (NFA) is a tuple A = (Q, Σ, δ, q0, F) where Q is a finite set of states, Σ is a finite alphabet, δ ⊆ Q × Σ × Q is the transition relation, q0 ∈ Q is the initial state, and F ⊆ Q is the set of final states. A deterministic finite automaton (DFA) is an NFA whose transition relation is a function δ : Q × Σ → Q. The language of an NFA or DFA A is denoted L(A), which is defined in the standard way [18].

#### **3.1 Program Traces**

St denotes the (possibly infinite) set of *program states*. For example, a program with two integer variables has St = Z × Z. A ⊆ P(St) is a (possibly infinite) set of *assertions* on program states. Σ denotes a finite alphabet of program *statements*. We refer to a finite string of statements as a (program) *trace*. With each statement a ∈ Σ we associate a *semantics* ⟦a⟧ ⊆ St × St, and extend ⟦−⟧ to traces via (relation) composition. A trace x ∈ Σ∗ is said to be *infeasible* if ⟦x⟧(St) = ∅, where ⟦x⟧(St) denotes the image of St under ⟦x⟧. To abstract away from a particular program syntax, we define a *program* as a regular language of traces. The semantics of a program P is simply the union of the semantics of its traces: ⟦P⟧ = ⋃x∈P ⟦x⟧. Concretely, one may obtain programs as languages by interpreting their edge-labelled control-flow graphs as DFAs: each vertex in the control-flow graph is a state, and each edge is a transition. The control-flow graph entry location is the initial state of the DFA, and all its exit locations are final states.
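The traces-as-relations view above can be sketched concretely; here is a minimal Python model (all names and the tiny state space are ours) where statements are relations on a finite state set and a trace's semantics is their composition:

```python
# Program states are integers drawn from a small finite set.
St = range(0, 5)

def assume(pred):
    """[p]: the identity relation restricted to states satisfying p."""
    return {(s, s) for s in St if pred(s)}

def assign(f):
    """x := f(x), as the relation {(s, f(s))} (dropping out-of-range results)."""
    return {(s, f(s)) for s in St if f(s) in St}

def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(s, u) for (s, m) in r1 for (m2, u) in r2 if m == m2}

def trace_semantics(trace):
    """Semantics of a1 ... an = compose of each statement's relation."""
    sem = {(s, s) for s in St}        # identity = semantics of the empty trace
    for a in trace:
        sem = compose(sem, a)
    return sem

# The trace [x > 2]; x := x + 1; [x < 2] is infeasible: the image of St is empty.
t = [assume(lambda s: s > 2), assign(lambda s: s + 1), assume(lambda s: s < 2)]
print(trace_semantics(t) == set())  # → True
```

An infeasible trace is exactly one whose composed relation has an empty image, matching the definition above.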

#### **3.2 Safety**

There are many equivalent notions of program safety; we use non-reachability. A program P is *safe* if all traces of P are infeasible, i.e. ⟦P⟧(St) = ∅. Standard partial correctness specifications are then represented via a simple encoding. Given a precondition φ and a postcondition ψ, the validity of the Hoare triple {φ}P{ψ} is equivalent to the safety of [φ] · P · [¬ψ], where [·] is a standard assume statement (or the singleton language containing it), and · is language concatenation.

*Example 3.1.* We use determinism as an example of how k-safety can be encoded in the framework defined thus far. If P is a program, then determinism of P is equivalent to safety of [φ] · (P1 ⧢ P2) · [¬φ], where P1 and P2 are copies of P operating on disjoint variables, ⧢ is the shuffle product of two languages, and [φ] is an assume statement asserting that the variables in each copy of P are equal.
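The shuffle (interleaving) operation on words can be sketched as follows; lifting to languages takes the union over all pairs of words. This is a small illustrative sketch, and the function name is ours:

```python
def shuffle(x, y):
    """All interleavings of words x and y (shuffle product on two words)."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({x[0] + s for s in shuffle(x[1:], y)} |
            {y[0] + s for s in shuffle(x, y[1:])})

# Interleave a trace "ab" of copy 1 with a trace "AB" of copy 2:
print(sorted(shuffle("ab", "AB")))
```

Two length-2 words yield the expected C(4, 2) = 6 interleavings; each one preserves the internal order of both copies.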

A *proof* is a finite set of assertions Π ⊆ A that includes *true* and *false*. Each Π gives rise to an NFA ΠNFA = (Π, Σ, δΠ, *true*, {*false*}) where δΠ(φpre, a) = {φpost | ⟦a⟧(φpre) ⊆ φpost}. We abbreviate L(ΠNFA) as L(Π). Intuitively, L(Π) consists of all traces that can be proven infeasible using only assertions in Π. Thus the following proof rule is sound [12,13,17]:

$$\frac{\exists \Pi \subseteq \mathcal{A}.\ P \subseteq \mathcal{L}(\Pi)}{P \text{ is safe}} \tag{Safe}$$

When P ⊆ L(Π), we say that Π is a proof for P. A proof does not uniquely belong to any particular program; a single Π may prove many programs correct.
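To make the construction of L(Π) concrete, here is a small Python sketch (all names and the toy state space are ours): assertions are sets of states, and a trace belongs to L(Π) exactly when the NFA over assertions can step from *true* to *false* while reading it.

```python
St = set(range(6))

def post(phi, a):
    """Strongest postcondition: the image of assertion phi under statement a."""
    return {t for (s, t) in a if s in phi}

def in_proof_language(pi, trace):
    """Is `trace` in L(pi)? Run the NFA whose states are assertions:
    start at `true` (= St), accept at `false` (= the empty set)."""
    current = {frozenset(St)}
    for a in trace:
        current = {phi_post for phi in current for phi_post in pi
                   if post(phi, a) <= phi_post}
    return frozenset() in current

inc = {(s, s + 1) for s in St if s + 1 in St}   # x := x + 1
test_hi = {(s, s) for s in St if s >= 3}        # assume x >= 3
test_lo = {(s, s) for s in St if s < 2}         # assume x < 2

Pi = {frozenset(St), frozenset({3, 4, 5}), frozenset({4, 5}), frozenset()}
print(in_proof_language(Pi, [test_hi, inc, test_lo]))  # → True
```

The trace [x ≥ 3]; x := x + 1; [x < 2] is provable infeasible with the assertions {3, 4, 5} and {4, 5}, so it lies in L(Π); dropping the final assume leaves a feasible trace outside L(Π).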

#### **4 Reductions**

The set of assertions used for a proof is usually determined by a particular language of assertions, and a safe program may not have a (safety) proof in that particular language. Yet, a subset of the program traces may have a proof in that assertion language. If it can be proven that the subset of program runs that have a safety proof are a faithful representation of all program behaviours (with respect to a given property), then the program is correct. This motivates the notion of *program reductions*.

**Definition 4.1 (semantic reduction).** *If for programs* P *and* P′*,* P′ *is safe implies that* P *is safe, then* P′ *is a* semantic reduction *of* P *(written* P′ ⪯ P*).*

The definition immediately gives rise to the following proof rule for proving program safety:

$$\frac{\exists P' \preceq P,\ \Pi \subseteq \mathcal{A}.\ P' \subseteq \mathcal{L}(\Pi)}{P \text{ is safe}} \tag{SafeRed1}$$

This generic proof rule is not automatable since, given a proof Π, verifying the existence of the appropriate reduction is *undecidable*. Observe that a program is safe if and only if ∅ is a valid reduction of the program. This means that discovering a semantic reduction and proving safety are mutually reducible to each other. To obtain decidable premises for the proof rule, discovering a reduction must be an easier problem than proving safety itself. One way to achieve this is by restricting the set of reductions under consideration from all reductions (given in Definition 4.1) to a proper subset that is more amenable to algorithmic checking. Fixing a set R of (semantic) reductions, we will have the rule:

$$\frac{\exists P' \in \mathcal{R}.\ P' \subseteq \mathcal{L}(\Pi) \qquad \forall P' \in \mathcal{R}.\ P' \preceq P}{P \text{ is safe}} \tag{SafeRed2}$$

**Proposition 4.2.** *The proof rule* SafeRed2 *is sound.*

The core contribution of this paper is an algorithmic solution inspired by the above proof rule. To achieve this, two subproblems are solved: (1) Given a set R of reductions of a program P and a candidate proof Π, can we check whether there exists a reduction P′ ∈ R which is covered by the proof Π? In Sect. 5, we propose a new semantic interpretation of an existing notion of infinite tree automata that gives rise to an algorithmic check for this step. (2) Given a program P, is there a general sound set of reductions R that can be effectively represented to accommodate step (1)? In Sect. 6, we propose a construction of an effective set of reductions, representable by our infinite tree automata, drawing inspiration from existing partial order reduction techniques [15].

#### **5 Proof Checking**

Given a set of reductions R of a program P and a candidate proof Π, we want to check whether there exists a reduction P′ ∈ R which is covered by Π. We call this *proof checking*. We use tree automata to represent certain classes of languages (i.e., sets of sets of strings), and then use operations on these automata for the purpose of proof checking.

The set Σ∗ can be represented as an infinite tree. Each x ∈ Σ∗ defines a path to a unique node in the tree: the root node is located at the empty string ε, and for all a ∈ Σ, the node located at xa is a child of the node located at x. Each node is then identified by the string labeling the path leading to it. A language L ⊆ Σ∗ (equivalently, L : Σ∗ → B) can consequently be represented as an infinite tree where the node at each x is labelled with the boolean value B ≡ (x ∈ L). An example is given in Fig. 2.

It follows that a set of languages is a set of infinite trees, which can be represented using automata over infinite trees. Looping Tree Automata (LTAs)

**Fig. 2.** Language {a} as an infinite tree.

are a subclass of Büchi Tree Automata where all states are accept states [2]. The class of Looping Tree Automata is closed under intersection and union, and checking emptiness of LTAs is decidable. Unlike Büchi Tree Automata, emptiness can be decided in linear time [2].

**Definition 5.1.** *A Looping Tree Automaton (LTA) over* |Σ|*-ary,* B*-labelled trees is a tuple* M = (Q, Δ, q0) *where* Q *is a finite set of states,* Δ ⊆ Q × B × (Σ → Q) *is the transition relation, and* q0 *is the initial state.*

Intuitively, an LTA M = (Q, Δ, q0) performs a parallel and depth-first traversal of an infinite tree L while maintaining some local state. Execution begins at the root from state q0, and M non-deterministically picks a transition (q0, B, σ) ∈ Δ such that B matches the label at the root of the tree (i.e. B = (ε ∈ L)). If no such transition exists, the tree is rejected. Otherwise, M recursively works on each child a from state σ(a) in parallel. This process continues infinitely, and L is accepted if and only if it is never rejected.

Formally, M's execution over a tree L is characterized by a *run* δ∗ : Σ∗ → Q where δ∗(ε) = q0 and (δ∗(x), x ∈ L, λa. δ∗(xa)) ∈ Δ for all x ∈ Σ∗. The set of languages accepted by M is then defined as L(M) = {L | ∃δ∗. δ∗ is a run of M on L}.

**Theorem 5.2.** *Given an LTA* M *and a regular language* L*, it is decidable whether* ∃P ∈ L(M). P ⊆ L*.*

The proof, which appears in [14], reduces the problem to deciding whether L(M)∩P(L) = ∅. LTAs are closed under intersection and have decidable emptiness checks, and the lemma below is the last piece of the puzzle.

**Lemma 5.3.** *If* L *is a regular language, then* P(L) *is recognized by an LTA.*

*Counterexamples.* Theorem 5.2 effectively states that proof checking is decidable. For automated verification, beyond checking the validity of a proof, we require counterexamples to fuel the development of the proof when the proof does not check. Note that in the simple case of the proof rule Safe, when P ⊈ L(Π) there exists a counterexample trace x ∈ P such that x ∉ L(Π).

With our proof rule SafeRed2, things get a bit more complicated. First, note that unlike the classic case (Safe), where a failed proof check coincides with the non-emptiness of an intersection check (i.e. P ∩ (Σ∗ \ L(Π)) ≠ ∅), in our case, a failed proof check coincides with the emptiness of an intersection check (i.e. R ∩ P(L(Π)) = ∅). The sets R and P(L(Π)) are both sets of languages. What does the witness to the emptiness of the intersection look like? Each language member of R contains at least one string that does not belong to any of the subsets of our proof language. One can collect all such witness strings to guarantee progress across the board in the next round. However, since LTAs can represent an infinite set of languages, one must take care not to end up with an infinite set of counterexamples following this strategy. Fortunately, this will not be the case.

**Theorem 5.4.** *Let* M *be an LTA and let* L *be a regular language such that* P ⊆ L *for all* P ∈ L(M)*. There exists a finite set of counterexamples* C *such that, for all* P ∈ L(M)*, there exists some* x ∈ C *such that* x ∈ P *and* x /∈ L*.*

The proof appears in [14]. This theorem justifies our choice of using LTAs instead of more expressive formalisms such as Büchi Tree Automata. For example, the Büchi Tree Automaton that accepts the language {{x} | x ∈ Σ∗} would give rise to an infinite number of counterexamples with respect to the empty proof (i.e. Π = ∅). The finiteness of the counterexample set presents an alternate proof that LTAs are strictly less expressive than Büchi Tree Automata [27].

#### **6 Sleep Set Reductions**

We have established so far that (1) a set of assertions gives rise to a regular language proof, and (2) given a regular language proof and a set of program reductions recognizable by an LTA, we can check the program (reductions) against the proof. The last piece of the puzzle is to show that a useful class of program reductions can be expressed using LTAs.

Recall our example from Sect. 2. The reduction we obtain is sound because, for every trace in the full parallel-composition program, an equivalent trace exists in the reduced program. By equivalent, we mean that one trace can be obtained from the other by swapping independent statements. Such an equivalence is the essence of the theory of Mazurkiewicz traces [9].

We fix a reflexive symmetric *dependence relation* D ⊆ Σ×Σ. For all a, b ∈ Σ, we say that a and b are *dependent* if (a, b) ∈ D, and say they are *independent* otherwise. We define ∼<sup>D</sup> as the smallest congruence satisfying xaby ∼<sup>D</sup> xbay for all x, y ∈ Σ<sup>∗</sup> and independent a, b ∈ Σ. The closure of a language L ⊆ Σ<sup>∗</sup> with respect to ∼<sup>D</sup> is denoted [L]D. A language L is ∼D*-closed* if L = [L]D. It is worthwhile to note that all input programs considered in this paper correspond to regular languages that are ∼D-closed.
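The closure [L]D can be computed for finite languages by repeatedly swapping adjacent independent letters until saturation; a sketch in Python (names ours), with the dependence relation given as a set of pairs:

```python
def closure(lang, D):
    """[L]_D: close L under swapping adjacent independent letters
    (x a b y ~ x b a y whenever a, b are independent)."""
    closed, frontier = set(lang), set(lang)
    while frontier:
        new = set()
        for w in frontier:
            for i in range(len(w) - 1):
                a, b = w[i], w[i + 1]
                if (a, b) not in D and (b, a) not in D:   # independent pair
                    swapped = w[:i] + b + a + w[i + 2:]
                    if swapped not in closed:
                        new.add(swapped)
        closed |= new
        frontier = new
    return closed

# 'a' and 'b' model statements of different threads (independent);
# each letter depends on itself, keeping D reflexive.
D = {('a', 'a'), ('b', 'b')}
print(sorted(closure({'aab'}, D)))  # → ['aab', 'aba', 'baa']
```

A language is ∼D-closed exactly when this saturation adds nothing new.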

An equivalence class of ∼D is typically called a (Mazurkiewicz) trace. We avoid using this terminology as it conflicts with our definition of traces as strings of statements in Sect. 3.1. We assume D is *sound*, i.e. ⟦ab⟧ = ⟦ba⟧ for all independent a, b ∈ Σ.

**Definition 6.1 (**D**-reduction).** *A program* P′ *is a* D-reduction *of a program* P*, that is* P′ ⪯D P*, if* [P′]D = P*.*

Note that the equivalence relation on programs induced by ∼<sup>D</sup> is a refinement of the semantic equivalence relation used in Definition 4.1.

**Lemma 6.2.** *If* P′ ⪯D P *then* P′ ⪯ P*.*

Ideally, we would like to define an LTA that accepts all D-reductions of a program P, but unfortunately this is not possible in general.

**Proposition 6.3 (corollary of Theorem 67 of** [9]**).** *For arbitrary regular languages* L1, L2 ⊆ Σ∗ *and relation* D*, the proposition* ∃L ⪯D L1. L ⊆ L2 *is undecidable.*

The proposition is decidable only when D is transitive, which does not hold for a semantically correct notion of independence for a parallel program encoding a k-safety property, since statements from the same thread are dependent and statements from different program copies are independent. Therefore, we have:

**Proposition 6.4.** *Assume* P *is a* ∼D*-closed program and* Π *is a proof. The proposition* ∃P′ ⪯D P. P′ ⊆ L(Π) *is undecidable.*

In order to obtain a decidable premise for proof rule SafeRed2, we present an approximation of the set of D-reductions, inspired by sleep sets [15]. The idea is to construct an LTA that recognizes *a class of* D*-reductions* of an input program P, whose language is assumed to be ∼D-closed. This automaton intuitively makes non-deterministic choices about which program traces to *prune* in favour of other ∼D-equivalent program traces for a given reduction. Different non-deterministic choices lead to different D-reductions.

Consider two statements a, b ∈ Σ where (a, b) ∉ D. Let x, y ∈ Σ∗ and consider two program runs xaby and xbay. We know ⟦xbay⟧ = ⟦xaby⟧. If the automaton makes a non-deterministic choice that the successors of xa have been explored, then the successors of xba need not be explored (they can be pruned away), as illustrated in Fig. 3. Now assume (a, c) ∈ D for some c ∈ Σ. When the node xbc is being explored, we can no longer safely ignore a-transitions, since the equality ⟦xbcay⟧ = ⟦xabcy⟧ is not guaranteed. Therefore, the a-successor of xbc has to be explored. The non-deterministic choice of which child node to explore is modelled by a choice of order in which we explore each node's children. Different orders yield different reductions. Reductions are therefore characterized as an assignment R : Σ∗ → Lin(Σ) from nodes to linear orderings on Σ, where (a, b) ∈ R(x) means we explore child xa after child xb.

**Fig. 3.** Exploring from x with sleep sets.

Given R : Σ<sup>∗</sup> → Lin(Σ), the *sleep set* sleepR(x) ⊆ Σ at node x ∈ Σ<sup>∗</sup> defines the set of transitions that can be ignored at x:

$$\text{sleep}\_R(\epsilon) = \emptyset \tag{1}$$

$$\text{sleep}\_R(xa) = (\text{sleep}\_R(x) \cup R(x)(a)) \setminus D(a) \tag{2}$$

Intuitively, (1) no transition can be ignored at the root node, since nothing has been explored yet, and (2) at node x, the sleep set of xa is obtained by adding the transitions explored before a (i.e. R(x)(a)) and then removing the ones that conflict with a (i.e. are related to a by D). Next, we define the nodes that are ignored. The set of ignored nodes is the smallest set ignoreR : Σ∗ → B such that

$$x \in \text{ignore}\_R \implies xa \in \text{ignore}\_R \tag{1}$$

$$a \in \text{sleep}\_R(x) \implies xa \in \text{ignore}\_R \tag{2}$$

Intuitively, a node xa is ignored if (1) any of its ancestors is ignored (ignoreR(x)), or (2) a is one of the ignored transitions at node x (a ∈ sleepR(x)).

Finally, we obtain an actual reduction of a program P from a characterization of a reduction R by removing the ignored nodes from P, i.e. P \ ignoreR.

**Lemma 6.5.** *For all* R : Σ<sup>∗</sup> → Lin(Σ)*, if* P *is a* ∼D*-closed program then* P \ ignore<sup>R</sup> *is a* D*-reduction of* P*.*

The set of all such reductions is reduceD(P) = {P \ignore<sup>R</sup> | R : Σ<sup>∗</sup> → Lin(Σ)}.
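The sleep-set recurrences above can be sketched for the special case where R assigns the same linear order to every node; the paper's LTA ranges over all R, and the names below are ours:

```python
def reduce_with_order(P, D, order):
    r"""Compute P \ ignore_R for the reduction R that uses the same linear
    order at every node. `order` lists Sigma; R(x)(a) = letters before a."""
    def dep(a):                       # D(a): letters dependent on a
        return {b for b in order if (a, b) in D or (b, a) in D}
    def before(a):                    # R(x)(a): letters explored before a
        return set(order[:order.index(a)])
    kept = set()
    for w in P:
        sleep = set()                 # sleep_R(epsilon) = empty set    (1)
        ignored = False
        for a in w:
            if a in sleep:            # a in sleep_R(x) => xa ignored   (2)
                ignored = True
                break
            sleep = (sleep | before(a)) - dep(a)
        if not ignored:
            kept.add(w)
    return kept

D = {('a', 'a'), ('b', 'b')}          # 'a' and 'b' are independent
P = {'aab', 'aba', 'baa'}             # a ~D-closed program
print(reduce_with_order(P, D, ['a', 'b']))  # → {'aab'}
```

Note that exactly one representative of the ∼D-class {aab, aba, baa} survives, as Theorem 6.7 below guarantees; choosing the order ['b', 'a'] instead keeps 'baa', illustrating how different orders yield different reductions.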

**Theorem 6.6.** *For any regular language* P*,* reduceD(P) *is accepted by an LTA.*

Interestingly, every reduction in reduceD(P) is optimal in the sense that each reduction contains at most one representative of each equivalence class of ∼D.

**Theorem 6.7.** *Fix some* P ⊆ Σ∗ *and* R : Σ∗ → Lin(Σ)*. For all* x, y ∈ P \ ignoreR*, if* x ∼D y *then* x = y*.*

#### **7 Algorithms**

Figure 4 illustrates the outline of our verification algorithm. It is a counterexample-guided abstraction refinement loop in the style of [12,13,17]. The key difference is that instead of checking whether some proof Π is a proof for the program P, it checks if there exists a reduction of the program P that Π proves correct.

**Fig. 4.** Counterexample-guided refinement loop.

The algorithm relies on an oracle Interpolate that, given a finite set of program traces C, returns a proof Π′, if one exists, such that C ⊆ L(Π′). In our tool, we use Craig interpolation to implement the oracle Interpolate. In general, since program traces are the simplest form of sequential programs (loop- and branch-free), any automated program prover that can handle proving them may be used.

The results presented in Sects. 5 and 6 give rise to the proof checking subroutine of the algorithm in Fig. 4 (i.e. the light grey test). Given a program DFA AP = (QP, Σ, δP, qP0, FP) and a proof DFA AΠ = (QΠ, Σ, δΠ, qΠ0, FΠ) (obtained by determinizing ΠNFA), we can decide ∃P′ ∈ reduceD(L(AP)). P′ ⊆ L(AΠ) by constructing an LTA MPΠ for reduceD(L(AP)) ∩ P(L(AΠ)) and checking emptiness (Theorem 5.2).

#### **7.1 Progress**

The algorithm corresponding to Fig. 4 satisfies a weak progress theorem: none of the counterexamples from a round of the algorithm will ever appear in a future counterexample set. This, however, is not strong enough to guarantee termination. Alternatively, one can think of the algorithm's progress as follows. In each round, new assertions are discovered through the oracle Interpolate, and one can optimistically hope that the algorithm finally converges on an existing target proof Π∗. The success of this strategy depends on two factors: (1) the counterexamples used by the algorithm belong to L(Π∗), and (2) the proof that Interpolate discovers for these counterexamples coincides with Π∗. The latter is a known wild card in software model checking, which cannot be guaranteed; there is plenty of empirical evidence, however, that procedures based on Craig interpolation do well in approximating it. The former is a new problem for our refinement loop.

In a standard algorithm in the style of [12,13,17], the verification proof rule dictates that every program trace must be in L(Π∗). In our setting, we only require a subset (corresponding to some reduction) to be in L(Π∗). This means one cannot simply rely on arbitrary program traces as *appropriate* counterexamples. Theorem 5.4 presents a solution to this problem: it ensures that we always feed Interpolate some counterexample from L(Π∗) and therefore guarantee progress.

**Theorem 7.1 (Strong Progress).** *Assume a proof* Π<sup>∗</sup> *exists for some reduction* <sup>P</sup><sup>∗</sup> ∈ R *and* Interpolate *always returns some subset of* <sup>Π</sup><sup>∗</sup> *for traces in* L(Π∗)*. Then the algorithm will terminate in at most* |Π<sup>∗</sup>| *iterations.*

Theorem 7.1 ensures that the algorithm will never get into an infinite loop due to a bad choice of counterexamples. The condition on Interpolate ensures that divergence does not occur due to the wrong choice of assertions by Interpolate; without it, any standard interpolation-based software model checking algorithm may diverge. The assumption that there exists a proof for a reduction of the program in the fixed set R ensures that the proof checking procedure can verify the target proof Π∗ once it is reached. Note that, in general, a proof may exist for a reduction of the program which is not in R. Therefore, the algorithm is not complete with respect to all reductions, since checking the premises of SafeRed1 is undecidable as discussed in Sect. 4.

#### **7.2 Faster Proof Checking Through Antichains**

The state set of MPΠ, the intersection of the program and proof LTAs, has size |QP × B × P(Σ) × QΠ|, which is exponential in |Σ|. Therefore, even a linear emptiness test for this LTA can be computationally expensive. Antichains have been previously used [8] to optimize certain operations over NFAs that also suffer from exponential blowups, such as deciding universality and inclusion tests. The main idea is that these operations involve computing downwards-closed and upwards-closed sets according to an appropriate subsumption relation, which can be represented compactly as antichains. We employ similar techniques to propose a new emptiness check algorithm.

*Antichains.* The set of maximal elements of a set X with respect to some ordering relation ⊑ is denoted max(X). The downwards-closure of a set X with respect to ⊑ is denoted ⌊X⌋. An antichain is a set X where no element of X is related (by ⊑) to another. The set max(X) of maximal elements of a finite set X is an antichain. If X is downwards-closed then ⌊max(X)⌋ = X.
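For the concrete case used later, where the ordering is set inclusion, the maximal-element antichain is one comprehension; a sketch in Python (names ours):

```python
def max_antichain(X):
    """Maximal elements of a finite family of sets under subset ordering."""
    return {s for s in X if not any(s < t for t in X)}  # s < t: strict subset

family = {frozenset({1}), frozenset({1, 2}), frozenset({2, 3}), frozenset({2})}
print(sorted(sorted(s) for s in max_antichain(family)))  # → [[1, 2], [2, 3]]
```

Here the four-element family is represented by an antichain of two sets; the savings grow exponentially for downwards-closed families over larger alphabets.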

The emptiness check algorithm for LTAs from [2] computes the set of *inactive* states (i.e. states which generate an empty language) and checks if the initial state is inactive. The set of inactive states of an LTA M = (Q, Δ, q0) is defined as the smallest set inactive(M) satisfying

$$\frac{\forall (q, B, \sigma) \in \Delta.\ \exists a.\ \sigma(a) \in \text{inactive}(M)}{q \in \text{inactive}(M)} \tag{Inactive}$$

Alternatively, one can view inactive(M) as the least fixed-point of a monotone (with respect to ⊆) function FM : P(Q) → P(Q) where

$$F\_M(X) = \{ q \mid \forall (q, B, \sigma) \in \Delta . \exists a. \sigma(a) \in X \}.$$

Therefore, inactive(M) can be computed using a standard fixpoint algorithm.
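The standard Kleene iteration for this fixpoint can be sketched directly (names and the toy automaton are ours): iterate FM from the empty set until it stabilizes.

```python
def inactive(Q, Sigma, Delta):
    """Least fixpoint of F_M(X) = {q | every transition from q has some
    child state already in X}. A state with no transitions is vacuously
    inactive (it rejects every tree)."""
    X = set()
    while True:
        X_next = {q for q in Q
                  if all(any(sigma[a] in X for a in Sigma)
                         for (q2, _, sigma) in Delta if q2 == q)}
        if X_next == X:
            return X
        X = X_next

# Tiny LTA: q0 steps to q1 on either label; q1 has no transitions at all.
Q = {'q0', 'q1'}
Sigma = {'a'}
Delta = [('q0', True, {'a': 'q1'}), ('q0', False, {'a': 'q1'})]
print(sorted(inactive(Q, Sigma, Delta)))  # → ['q0', 'q1']
```

In the example, q1 is inactive immediately, and q0 becomes inactive in the next iteration because its only child is; M accepts nothing iff its initial state ends up in this set.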

If inactive(M) is downwards-closed with respect to some *subsumption relation* (⊑) ⊆ Q × Q, then we need not represent all of inactive(M). The antichain max(inactive(M)) of maximal elements of inactive(M) (with respect to ⊑) would be sufficient to represent the entirety of inactive(M), and can be exponentially smaller than inactive(M), depending on the choice of relation ⊑.

A trivial way to compute max(inactive(M)) is to first compute inactive(M) and then find the maximal elements of the result, but this involves doing strictly more work than the baseline algorithm. However, observe that if FM also preserves downwards-closedness with respect to ⊑, then

$$\begin{aligned} \max(\text{inactive}(M)) &= \max(\text{lfp}(F\_M)) \\ &= \max(\text{lfp}(F\_M \circ \lfloor - \rfloor \circ \max)) = \text{lfp}(\max \circ F\_M \circ \lfloor - \rfloor) \end{aligned}$$

That is, max(inactive(M)) is the least fixed-point of a function F^max_M : P(Q) → P(Q) defined as F^max_M(X) = max(FM(X)). We can calculate max(inactive(M)) efficiently if we can calculate F^max_M(X) efficiently, which is true in the special case of the intersection automaton for the languages of our proof P(L(Π)) and our program reduceD(P), which we refer to as MPΠ.

We are most interested in the state space of MPΠ, which is QPΠ = (QP × B × P(Σ)) × QΠ. Observe that states whose B part is ⊤ are always active:

**Lemma 7.2.** ((qP, ⊤, S), qΠ) ∉ inactive(MPΠ) *for all* qP ∈ QP*,* qΠ ∈ QΠ*, and* S ⊆ Σ*.*

The state space can then be assumed to be QPΠ = (QP × {⊥} × P(Σ)) × QΠ for the purposes of checking inactivity. The subsumption relation defined as the smallest relation ⊑PΠ satisfying

$$S \subseteq S' \implies ((q\_P, \bot, S), q\_{\Pi}) \sqsubseteq\_{P\Pi} ((q\_P, \bot, S'), q\_{\Pi})$$

for all qP ∈ QP, qΠ ∈ QΠ, and S, S′ ⊆ Σ, is a suitable one since:

**Lemma 7.3.** FMPΠ *preserves downwards-closedness with respect to* ⊑PΠ*.*

The function F^max_MPΠ is a function over relations

$$F\_{M\_{P\Pi}}^{\max} \colon \mathcal{P}((Q\_P \times \{\bot\} \times \mathcal{P}(\Sigma)) \times Q\_{\Pi}) \to \mathcal{P}((Q\_P \times \{\bot\} \times \mathcal{P}(\Sigma)) \times Q\_{\Pi})$$

but in our case it is more convenient to view it as a function over functions

$$F\_{M\_{P\Pi}}^{\max} : (Q\_P \times \{\bot\} \times Q\_{\Pi} \to \mathcal{P}(\mathcal{P}(\Sigma))) \to (Q\_P \times \{\bot\} \times Q\_{\Pi} \to \mathcal{P}(\mathcal{P}(\Sigma)))$$

Through some algebraic manipulation and some simple observations, we can define F^max_MPΠ functionally as follows.

**Lemma 7.4.** *For all* qP ∈ QP*,* qΠ ∈ QΠ*, and* X : QP × {⊥} × QΠ → P(P(Σ))*,*

$$F\_{M\_{P\Pi}}^{\max}(X)(q\_P, \bot, q\_{\Pi}) = \begin{cases} \{\Sigma\} & \text{if } q\_P \in F\_P \land q\_{\Pi} \notin F\_{\Pi} \\ \bigsqcap\_{R \in \mathcal{L}in(\Sigma)} \bigsqcup\_{\substack{a \in \Sigma\\ S \in X(q\_P', \bot, q\_{\Pi}')}} S' & \text{otherwise} \end{cases}$$

*where*

$$\begin{aligned} q'\_P &= \delta\_P(q\_P, a) & X \sqcap Y &= \max\{x \cap y \mid x \in X \wedge y \in Y\} \\ q'\_{\Pi} &= \delta\_{\Pi}(q\_{\Pi}, a) & X \sqcup Y &= \max(X \cup Y) \end{aligned}$$

$$S' = \begin{cases} \{(S \cup D(a)) \setminus \{a\}\} & \text{if } R(a) \setminus D(a) \subseteq S \\ \emptyset & \text{otherwise} \end{cases}$$

**Algorithm 1.** Antichain-based proof checking.

    function Check(AP, AΠ, D)
        (QP, Σ, δP, q0P, FP) ← AP
        (QΠ, Σ, δΠ, q0Π, FΠ) ← AΠ
        function FMax(X)((qP, ⊥, qΠ))
            if qP ∈ FP ∧ qΠ ∉ FΠ
                return {Σ}
            X⊤ ← {Σ}
            for R ∈ Lin(Σ)
                X⊥ ← ∅
                for a ∈ Σ, S ∈ X((δP(qP, a), ⊥, δΠ(qΠ, a)))
                    if R(a) \ D(a) ⊆ S
                        X⊥ ← X⊥ ⊔ {(S ∪ D(a)) \ {a}}
                X⊤ ← X⊤ ⊓ X⊥
            return X⊤
        return whether the start state (q0P, ⊥, q0Π) of MPΠ is active in Fix(FMax)

A full justification appears in [14]. Formulating F^max_MPΠ as a higher-order function allows us to calculate max(inactive(MPΠ)) using efficient fixpoint algorithms like the one in [22]. Algorithm 1 outlines our proof checking routine. Fix : ((A → B) → (A → B)) → (A → B) is a procedure that computes the least fixpoint of its input. The algorithm simply computes the fixpoint of the function F^max_MPΠ as defined in Lemma 7.4, which is a compact representation of inactive(MPΠ), and checks if the start state of MPΠ is in it.

*Counterexamples.* Theorem 5.4 states that a finite set of counterexamples exists whenever ∃P ∈ reduceD(P). P ⊆ L(Π) does not hold. The proof of emptiness for an LTA, formed using rule Inactive above, is a finite tree. Each edge in the tree is labelled by an element of Σ (obtained from the existential in the rule) and the paths through this tree form the counterexample set. To compute this set, then, it suffices to remember enough information during the computation of inactive(M) to reconstruct the proof tree. Every time a state q is determined to be inactive, we must also record the witness a ∈ Σ for each transition (q, B, σ) ∈ Δ such that σ(a) ∈ inactive(M).

In an antichain-based algorithm, once we determine a state q to be inactive, we simultaneously determine every state that q subsumes to be inactive as well. If we recorded unique witnesses for each and every state that q subsumes, the space complexity of our antichain algorithm would be the same as that of the unoptimized version. The following lemma states that it is sufficient to record witnesses only for q and to discard witnesses for the states that q subsumes.

**Lemma 7.5.** *Fix some states* q, q′ *such that* q′ ⊑<sub>PΠ</sub> q*. A witness used to prove that* q *is inactive can also be used to prove that* q′ *is inactive.*

Note that this means that the antichain algorithm soundly returns potentially fewer counterexamples than the original one.

#### **7.3 Partition Optimization**

The LTA construction for reduce<sub>D</sub>(P) involves a nondeterministic choice of linear order at each state. Since Lin(Σ) has size |Σ|!, each state in the automaton would have a large number of transitions. As an optimization, our algorithm selects ordering relations from Part(Σ) (instead of Lin(Σ)), defined as Part(Σ) = {Σ<sub>1</sub> × Σ<sub>2</sub> | Σ<sub>1</sub> ⊎ Σ<sub>2</sub> = Σ}, where ⊎ denotes disjoint union. This yields a sound algorithm that is not complete with respect to sleep set reductions, trading the factorial complexity of computing Lin(Σ) for an exponential one.
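The trade-off can be illustrated by enumerating both choice spaces for a small alphabet (an illustrative Python sketch; the function names and the decision to skip the two trivial splits are mine):

```python
# Illustrative comparison of the two choice spaces from Sect. 7.3:
# Lin(S), all strict linear orders on S (|S|! of them), versus Part(S),
# the orders S1 x S2 induced by splitting S into two disjoint blocks.
from itertools import permutations

def lin(sigma):
    """All strict linear orders on sigma, each as a set of pairs (a, b) with a < b."""
    orders = []
    for perm in permutations(sigma):
        orders.append({(perm[i], perm[j])
                       for i in range(len(perm))
                       for j in range(i + 1, len(perm))})
    return orders

def part(sigma):
    """All orders S1 x S2 for non-trivial splits sigma = S1 (disjoint union) S2."""
    sigma = list(sigma)
    orders = []
    for mask in range(1, 2 ** len(sigma) - 1):  # skip the two trivial splits
        s1 = {a for i, a in enumerate(sigma) if mask & (1 << i)}
        s2 = set(sigma) - s1
        orders.append({(a, b) for a in s1 for b in s2})
    return orders

alphabet = ['a', 'b', 'c', 'd']
print(len(lin(alphabet)), len(part(alphabet)))  # 24 orders vs. 14 splits
```

Already at |Σ| = 4 the factorial space is larger, and the gap widens rapidly as the alphabet grows.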

#### **8 Experimental Results**

To evaluate our approach, we have implemented our algorithm in a tool called Weaver written in Haskell. Weaver accepts a program written in a simple imperative language as input, where the property is already encoded in the program in the form of *assume* statements, and attempts to prove the program correct. The dependence relation for each input program is computed using a heuristic that ensures ∼D-closedness. It is based on the fact that the shuffle product (i.e. parallel composition) of two ∼D-closed languages is ∼D-closed.

Weaver employs two verification algorithms: (1) the total order algorithm presented in Algorithm 1, and (2) the variation with the partition optimization discussed in Sect. 7.3. It also implements multiple counterexample generation algorithms: (1) *Naive:* selects the first counterexample in the difference of the program and proof languages. (2) *Progress-Ensuring:* selects a set of counterexamples satisfying Theorem 5.4. (3) *Bounded Progress-Ensuring:* selects a few counterexamples (in most cases just one) from the set computed by the progress-ensuring algorithm. Our experiments demonstrated that in the vast majority of cases, the bounded progress-ensuring algorithm (an instance of the partition algorithm) is the fastest option. Therefore, all results reported in this section use this instance of the algorithm.

For the larger benchmarks, we use a simple sound optimization to reduce the proof size. We declare the basic blocks of code as atomic, so that internal assertions need not be generated for them as part of the proof. This optimization is incomplete with respect to sleep set reductions.

*Benchmarks.* We use a set of sequential benchmarks from [24] and include additional sequential benchmarks that involve more interesting reductions in their proofs. We have a set of parallel benchmarks, which are beyond the scope of previous hypersafety verification techniques. We use these benchmarks to demonstrate that our technique/tool can seamlessly handle concurrency. These involve proving concurrency specific hypersafety properties such as determinism and equivalence of parallel and sequential implementations of algorithms. Finally, since the proof checking algorithm is the core contribution of this paper, we have a contrived set of instances to stress test our algorithm. These involve proving determinism of simple parallel-disjoint programs with various numbers of threads and statements per thread. These benchmarks have been designed to cause a combinatorial explosion for the proof checker and counterexample generation routines. More information on the benchmarks can be found in [14].

*Evaluation.* Due to space restrictions, a detailed account of all our experiments (over 50 benchmarks) is not feasible here; a detailed table can be found in [14]. Table 1 summarizes the results as averages per benchmark group, and we discuss our main findings below.

**Proof construction time** refers to the time spent to construct L(Π) from a given set of assertions Π and excludes the time to produce proofs for the counterexamples in a given round. **Proof checking time** is the time spent to check if the current proof candidate is strong enough for a reduction of the program. In the fastest instances (total time around 0.01 s), roughly equal time is spent in proof checking and proof construction. In the slowest instances, the total time is almost entirely spent in proof construction.

**Table 1.** Experimental results: averages for benchmark groups.

In contrast, in our stress tests (designed to stress the proof checking algorithm), the majority of the time is spent in proof checking. The time spent in proving counterexamples correct is negligible in all instances. **Proof sizes** vary from 4 assertions to 298 for the most complicated instance. Verification times are *correlated* with the final proof size; larger proofs tend to cause longer verification times.

**Numbers of refinement rounds** vary from 2 for the simplest instance to 33 for the most complicated. A small number of refinement rounds (e.g., 2) implies a fast verification time, but for higher numbers of rounds there is no strong positive correlation between the number of rounds and verification time.

For our **parallel program** benchmarks (other than the stress tests), the tool spends the majority of its time in proof construction. We therefore designed specific (unusual) parallel programs to stress test the proof checker. **Stress test** benchmarks are trivial tests of determinism of disjoint parallel programs, which could easily be proven correct using the atomic block optimization; however, we force the tool to do the unnecessary hard work. These instances simulate the worst-case theoretical complexity, where the proof checking time and the number of counterexamples grow exponentially with the number and size of the threads. In the largest instance, more than 99% of the total verification time is spent in proof checking. Averages are not very informative for these instances and are therefore not included in Table 1.

Finally, Weaver is slow only when verifying 3-safety properties of the large looping benchmarks from [24]. Note that unlike the approach in [24], which starts from a default lockstep reduction (that happens to be sufficient to prove these instances), we do not assume any reduction and consider them all. The extra time is therefore expected when the product programs become quite large.

#### **9 Related Work**

The notion of a k-safety hyperproperty was introduced in [7] without consideration for automatic program verification. The approach of reducing k-safety to 1-safety by self-composition is introduced in [5]. While theoretically complete, self-composition is not practical, as discussed in Sect. 1. Product programs generalize the self-composition approach and have been used in translation validation [20], verification of non-interference [16,23], and program optimization [25]. A product of two programs P<sub>1</sub> and P<sub>2</sub> is semantically equivalent to P<sub>1</sub> · P<sub>2</sub> (sequential composition), but is made easier to verify by allowing parts of each program to be interleaved. The product programs proposed in [3] allow lockstep interleaving exclusively, and only when the control structures of P<sub>1</sub> and P<sub>2</sub> match. This restriction is lifted in [4] to allow some non-lockstep interleavings. However, the given construction rules are non-deterministic, and the choice of product program is left to the user or a heuristic.

Relational program logics [6,28] extend traditional program logics to allow reasoning about relational program properties, however automation is usually not addressed. Automatic construction of product programs is discussed in [10] with the goal of supporting procedure specifications and modular reasoning, but is also restricted to lockstep interleavings. Our approach does not support procedure calls but is fully automated and permits non-lockstep interleavings.

The key feature of our approach is the combined automated discovery of a program reduction and its proof. In this regard, the only comparable method is the one based on Cartesian Hoare Logic (CHL) proposed in [24], along with an algorithm for automatic verification based on CHL. Their algorithm implicitly constructs a product program, using a heuristic that favours lockstep executions as much as possible, and then prioritizes certain rules of the logic over the rest. The heuristic nature of the proof search means that no characterization of the search space can be given, and there are no guarantees that an appropriate product program will be found. In contrast, this paper gives a formal characterization of the set of explored product programs. Moreover, CHL was not designed to deal with concurrency.

Lipton [19] first proposed reduction as a way to simplify reasoning about concurrent programs. His ideas have been employed in a semi-automatic setting in [11]. Partial-order reduction (POR) is a class of techniques that reduces the search state space by removing redundant paths. POR techniques are concerned with finding a single (preferably minimal) reduction of the input program; in contrast, we use the same underlying ideas to explore many program reductions simultaneously. The class of reductions described in Sect. 6 is based on the sleep set technique of Godefroid [15]. Other techniques [1,15] are used in conjunction with sleep sets to achieve minimality in a standard POR setting. In our setting, reductions generated by sleep sets are already optimal (Theorem 6.7); however, employing these additional POR techniques may suggest ways of optimizing our proof checking algorithm by producing a smaller reduction LTA.

#### **References**

1. Abdulla, P.A., Aronis, S., Jonsson, B., Sagonas, K.: Source sets: a foundation for optimal dynamic partial order reduction. J. ACM (JACM) **64**(4), 25 (2017)


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Automated Synthesis of Secure Platform Mappings**

Eunsuk Kang<sup>1(B)</sup>, Stéphane Lafortune<sup>2</sup>, and Stavros Tripakis<sup>3</sup>

<sup>1</sup> Carnegie Mellon University, Pittsburgh, USA
eskang@cmu.edu
<sup>2</sup> University of Michigan, Ann Arbor, USA
stephane@umich.edu
<sup>3</sup> Northeastern University, Boston, USA
stavros@northeastern.edu

**Abstract.** System development often involves decisions about how a high-level design is to be implemented using primitives from a low-level platform. Certain decisions, however, may introduce undesirable behavior into the resulting implementation, possibly leading to a violation of a desired property that has already been established at the design level. In this paper, we introduce the problem of *synthesizing a property-preserving platform mapping*: synthesize a set of implementation decisions ensuring that a desired property is preserved from a high-level design into a low-level platform implementation. We formalize this synthesis problem and propose a technique for generating a mapping based on symbolic constraint search. We describe our prototype implementation, and two real-world case studies demonstrating the applicability of our technique to the synthesis of secure mappings for the popular web authorization protocols OAuth 1.0 and 2.0.

#### **1 Introduction**

When building a complex software system, one may begin by coming up with an abstract design, and then construct an implementation that conforms to this design. In practice, there are rarely enough time and resources available to build an implementation from scratch, and so this process often involves reuse of an existing *platform*—a collection of generic components, data structures, and libraries that are used to build an application in a particular domain.

The benefits of reuse also come with potential risks. A typical platform exhibits its own complex behavior, including subtle interactions with the environment that may be difficult to anticipate and reason about. Typically, the developer must work with the platform as it exists, and is rarely given the luxury of being able to modify it and remove unwanted features. For example, when building a web application, a developer must work with a standard browser and take into account all its features and security vulnerabilities. As a result, achieving an implementation that perfectly conforms to the design—in the traditional notion of behavioral refinement [20]—may be too difficult in practice. Worse, the resulting implementation may not necessarily preserve desirable properties that have already been established at the level of design.

This work has been supported by NSF award CNS-1801546.

These risks are especially evident in applications where security is a major concern. For example, OAuth 2.0, a popular authorization protocol subjected to rigorous and formal analysis at an abstract level [9,33,42], has been shown to be vulnerable to attacks when implemented on a web browser or a mobile device [10,39,41]. Many of these vulnerabilities are not due to simple programming errors: They arise from logical flaws that involve a subtle interaction between the protocol logic and the details of the underlying platform. Unfortunately, OAuth itself does not explicitly guard against these flaws, since it is intended to be a *generic*, *abstract* protocol that deliberately omits details about potential platforms. On the other hand, anticipating and mitigating against these risks require an in-depth understanding of the platform and security expertise, which many developers do not possess.

This paper proposes an approach to help developers overcome these risks and achieve an implementation that preserves desired properties. In particular, we formulate this task as the problem of automatically synthesizing a *property-preserving platform mapping*: A set of implementation decisions ensuring that a desired property is preserved from a high-level design into a low-level platform implementation.

Our approach builds on the prior work of Kang et al. [28], which proposes a modeling and verification framework for reasoning about security attacks across multiple levels of abstraction. The central notion in this framework is that of a *mapping*, which captures a developer's decisions about how abstract system entities are to be realized in terms of their concrete counterparts. In this paper, we fix a bug in the formalization of mapping in [28] and extend the framework of [28] with the novel problem of synthesizing a property-preserving mapping. In addition, we present an algorithmic technique for performing this synthesis task. Our technique, inspired by the highly successful paradigms of *sketching* and *syntax-guided synthesis* [3,26,37,38], takes a *constraint generalization* approach to (1) quickly prune the search space and (2) produce a solution that is *maximal* (i.e., a largest set of mappings that preserve a given property).

We have built a prototype implementation of the synthesis technique. Our tool accepts a high-level design model, a desired system property (both specified by the developer), and a model of a low-level platform (built and maintained separately by a domain expert). The tool then produces a maximal set of mappings (if one exists) that would ensure that the resulting platform implementation preserves the given property. We have successfully applied our tool to synthesize property-preserving mappings for two non-trivial case studies: the authentication protocols OAuth 1.0 and 2.0 implemented on top of HTTP. Our results are promising: The implementation decisions captured by our synthesized mappings describe effective mitigations against some of the common vulnerabilities that have been found in deployed OAuth implementations [39,41].

The contributions of this paper include: a formal treatment of mapping, including a correction in the original definition [28] (Sect. 2); a formulation of the *mapping synthesis problem*, a novel approach for ensuring the preservation of a property between a high-level design and its platform implementation (Sect. 3); a technique for automatically synthesizing mappings based on symbolic constraint search (Sect. 4); and a prototype implementation of the synthesis technique along with a real-world case study demonstrating the feasibility of this approach (Sect. 5). We conclude with a discussion of related work (Sect. 6).

#### **2 Mapping Composition**

Our approach builds on the modeling and verification framework proposed by Kang et al. [28], which is designed to allow modular reasoning about behavior of processes across multiple abstraction layers. In this framework, a trace-based semantic model (based on CSP [21]) is extended to represent events as *sets of labels*, and includes a new composition operator based on the notion of *mappings*, which relate event labels from one abstraction layer to another. In this section, we present the essential elements of this framework.

**Fig. 1.** A pair of high-level (abstract) and low-level (public) communication models. Note that each event is a *set* of labels, where each label describes one possible representation of the event.

**Running Example.** Consider a simple example involving communication of messages among a set of processes. In our modeling approach, the communication of a message is represented by labels of the form *sender*.*receiver*.*message*. For example, label a.e.p represents Alice sending Eve a public, non-secret message. Similarly, a.b.s represents Alice sending a secret message to another process (b for Bob, for example). In this system, Alice is unwilling to share its secret with Eve; in Fig. 1(a), this is modeled by the absence of any transition on event {a.e.s} in the Alice process.

Eve is a malicious character whose goal is to learn Alice's secret. Besides a.e.p and a.e.s, Eve is associated with two additional labels, u.e.p and u.e.s, which represent receiving a public or secret message, respectively, from some *unknown* sender u. Conceptually, these two latter labels can be regarded as *side channels* [30] that Eve uses to obtain information.

A desirable property of this abstract communication system is that Eve should never be able to learn Alice's secret<sup>1</sup>. In this case, it can be easily observed that the property holds, since Alice, by design, never sends the secret to Eve.

<sup>1</sup> A formalization of this property is provided later in this section.

The model in Fig. 1(b) describes communication over a low-level public channel that is shared among all processes. A message sent over this channel may be encrypted using a key, as captured by labels of the form *message*.*key*. For instance, p.x and s.x represent the transmission of a public and secret message, respectively, using key x. A message may also be sent in plaintext by omitting an encryption key (e.g., label s represents the plaintext transmission of a secret). Each receiver on the public channel is assumed to have knowledge of only a single key; for instance, RecvX only knows key x and thus cannot receive messages that are encrypted using key y (i.e., labels p.y and s.y do not appear in events of RecvX).

Suppose that we wish to reason about the behavior of the abstract communication system from Fig. 1(a) when it is implemented over the public channel in Fig. 1(b). In particular, in the low-level implementation, Eve and other processes (e.g., Bob) are required to share the same channel, no longer benefitting from the separation provided by the abstraction in Fig. 1(a). Does the property of the abstract communication hold in every possible implementation? If not, which decisions ensure that Alice's secret remains protected from Eve? We formulate these questions as the problem of synthesizing a *property-preserving mapping* between a pair of high-level and low-level models.

**Events, Traces, and Processes.** Let *L* be a potentially infinite set of labels. An *event e* is a finite, non-empty set of labels: *e* ∈ *E*(*L*), where *E*(*L*) is the set of all finite subsets of *L* except the empty set ∅. Let *S*<sup>∗</sup> be the set of all finite sequences of elements of set *S*. A *trace t* is a finite sequence of events: *t* ∈ *T*(*L*), where *T*(*L*) is the set of all traces over *L* (i.e., *T*(*L*) = (*E*(*L*))<sup>∗</sup>). The empty trace is denoted by ⟨⟩, and the trace consisting of a sequence of events *e*<sub>1</sub>, *e*<sub>2</sub>, ... is denoted ⟨*e*<sub>1</sub>, *e*<sub>2</sub>, ...⟩. If *t* and *t*′ are traces, then *t* · *t*′ is the trace obtained by concatenating *t* and *t*′. Note that ⟨⟩ · *t* = *t* · ⟨⟩ = *t* for any trace *t*.

Let *t* be a trace over a set of labels *L*, and let *A* ⊆ *L* be a subset of *L*. The *projection of t onto A*, denoted *t* ↾ *A*, is defined as follows:

$$\langle\rangle \restriction A = \langle\rangle \qquad (\langle e \rangle \cdot t) \restriction A = \begin{cases} \langle e \cap A \rangle \cdot (t \restriction A) & \text{if } e \cap A \neq \emptyset \\ t \restriction A & \text{otherwise} \end{cases}$$

For example, if *t* = ⟨{*a*}, {*a*, *c*}, {*b*}⟩, then *t* ↾ {*a*, *b*} = ⟨{*a*}, {*a*}, {*b*}⟩ and *t* ↾ {*b*, *c*} = ⟨{*c*}, {*b*}⟩.
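The projection operator can be transcribed directly (an illustrative Python rendering; encoding events as Python sets and traces as lists is my choice):

```python
# Direct transcription of the projection t |A defined above: restrict each
# event to A, and drop events that become empty under the restriction.
def project(trace, A):
    """Project a trace (list of sets of labels) onto the label set A."""
    out = []
    for e in trace:
        r = e & A
        if r:  # keep the restricted event only if it is non-empty
            out.append(r)
    return out

t = [{'a'}, {'a', 'c'}, {'b'}]
print(project(t, {'a', 'b'}))  # [{'a'}, {'a'}, {'b'}]
print(project(t, {'b', 'c'}))  # [{'c'}, {'b'}]
```

This reproduces the worked example from the text.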

A *process P* is defined as a triple (L<sub>P</sub>, E<sub>P</sub>, T<sub>P</sub>). The *labels* of process *P*, L<sub>P</sub> ⊆ *L*, is the set of all labels appearing in *P*, and E<sub>P</sub> ⊆ *E*(*L*) is the set of events that *may* appear in traces of *P*, which are denoted by T<sub>P</sub> ⊆ *T*(*L*). We assume the traces of every process *P* to be *prefix-closed*; i.e., ⟨⟩ ∈ T<sub>P</sub>, and for every non-empty trace *t* = *t*′ · ⟨*e*⟩ ∈ T<sub>P</sub>, we have *t*′ ∈ T<sub>P</sub>.

**Parallel Composition.** A pair of processes *P* and *Q* synchronize with each other by performing events *e*<sub>1</sub> and *e*<sub>2</sub>, respectively, if these two events share at least one label. In their parallel composition, denoted *P* ∥ *Q*, this synchronization is represented by a new event *e* constructed as the union of *e*<sub>1</sub> and *e*<sub>2</sub> (i.e., *e* = *e*<sub>1</sub> ∪ *e*<sub>2</sub>).

Formally, let *P* = (L*P*, E*P*, T*P*) and *Q* = (L*Q*, E*Q*, T*Q*) be a pair of processes. Their parallel composition is defined as follows:

$$\begin{aligned} E_{P \parallel Q} &= \{ e \in E(L_P \cup L_Q) \mid \mathit{eventCond}(e, P) \land \mathit{eventCond}(e, Q) \land \mathit{syncCond}(e) \} \\ T_{P \parallel Q} &= \{ t \in (E_{P \parallel Q})^{*} \mid (t \upharpoonright L_P) \in T_P \land (t \upharpoonright L_Q) \in T_Q \} \end{aligned} \tag{\bf{Def. 1}}$$

where L<sub>P∥Q</sub> = L<sub>P</sub> ∪ L<sub>Q</sub>, predicate *eventCond* is defined as

$$\text{eventCond}(e, P) \equiv e \cap L\_P = \emptyset \lor e \cap L\_P \in E\_P$$

and a condition on synchronization, *syncCond*, is defined as

$$\mathit{syncCond}(e) \equiv e \subseteq L_P - L_Q \lor e \subseteq L_Q - L_P \lor (\exists\, a \in e : a \in L_P \cap L_Q) \tag{\bf{Cond. 1}}$$

The definition of T<sub>P∥Q</sub> states that if we take a trace *t* in the composite process and ignore labels that appear only in *Q*, then the resulting trace must be a valid trace of *P* (and symmetrically for *Q*). The condition (**Cond. 1**) is imposed on every event appearing in T<sub>P∥Q</sub> to ensure that an event performed together by *P* and *Q* contains at least one label shared by both processes.

This type of parallel composition can be seen as a generalization of the parallel composition of CSP [21], from single labels to *sets* of labels. That is, the CSP parallel composition is the special case of the composition of **Def. 1** where every event is a singleton (i.e., it contains exactly one label). Note that if event *e* contains exactly one label *a*, then *a* must belong to the alphabet of *P* or that of *Q*, which means *syncCond*(*e*) always evaluates to true. The resulting expression in that case

$$T\_{P \parallel Q} = \{ t \in T(L\_P \cup L\_Q) \mid (t \upharpoonright L\_P) \in T\_P \land (t \upharpoonright L\_Q) \in T\_Q \} $$

is equivalent to the definition of parallel composition in CSP [21, Sec. 2.3.3].
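The two predicates of **Def. 1** can be rendered directly as executable checks (an illustrative Python sketch; the set-based encodings of events, alphabets, and event sets are mine):

```python
# Illustrative renderings of the predicates of Def. 1.
def event_cond(e, L_P, E_P):
    """eventCond(e, P): e restricted to P's alphabet is empty
    or is a legal event of P."""
    r = e & L_P
    return not r or r in E_P

def sync_cond(e, L_P, L_Q):
    """syncCond(e): e lies wholly outside one process's alphabet,
    or contains a label shared by both alphabets."""
    return e <= (L_P - L_Q) or e <= (L_Q - L_P) or bool(e & L_P & L_Q)

# Singleton events over L_P or L_Q always satisfy syncCond,
# matching the CSP special case discussed above.
L_P, L_Q = {'a', 'c'}, {'b', 'c'}
print(sync_cond(frozenset({'a'}), L_P, L_Q))       # True: {'a'} lies in L_P - L_Q
print(sync_cond(frozenset({'a', 'b'}), L_P, L_Q))  # False: no shared label
print(sync_cond(frozenset({'a', 'c'}), L_P, L_Q))  # True: c is in both alphabets
```

The third call shows a genuine synchronization: the event spans both alphabets but contains the common label c.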

**Mapping Composition.** A *mapping m over a set of labels L* is a partial function *m* : *L* → *L*. Informally, *m*(*a*) = *b* stipulates that every event that contains *a* as a label is to be assigned *b* as an additional label. We sometimes use the notations *a* ↦<sub>m</sub> *b* or (*a*, *b*) ∈ *m* as alternatives to *m*(*a*) = *b*. When we write *m*(*a*) = *b*, we mean that *m*(*a*) is defined and equal to *b*. The *empty* mapping, denoted *m* = ∅, is the partial function *m* : *L* → *L* that is undefined for all *a* ∈ *L*.

*Mapping composition* allows a pair of processes to interact with each other over distinct labels. Formally, consider two processes *P* = (L<sub>P</sub>, E<sub>P</sub>, T<sub>P</sub>) and *Q* = (L<sub>Q</sub>, E<sub>Q</sub>, T<sub>Q</sub>), and let *L* = L<sub>P</sub> ∪ L<sub>Q</sub>. Given a mapping *m* : *L* → *L*, the *mapping composition P* ∥<sub>m</sub> *Q* is defined as follows:

$$\begin{aligned} E_{P \parallel_m Q} &= \{ e \in E(L_P \cup L_Q) \mid \mathit{eventCond}(e, P) \land \mathit{eventCond}(e, Q) \land \mathit{syncCond}'(e) \land \mathit{mapCond}(e, m) \} \\ T_{P \parallel_m Q} &= \{ t \in (E_{P \parallel_m Q})^{*} \mid (t \upharpoonright L_P) \in T_P \land (t \upharpoonright L_Q) \in T_Q \} \end{aligned} \tag{\bf{Def. 2}}$$

where L<sub>P∥<sub>m</sub>Q</sub> = L<sub>P</sub> ∪ L<sub>Q</sub>, and *syncCond*′(*e*) and *mapCond*(*e*, *m*) are defined as:

$$\begin{aligned} \mathit{syncCond}'(e) &\equiv \mathit{syncCond}(e) \lor (\exists\, a \in e \cap L_P, \exists\, b \in e \cap L_Q : m(a) = b \lor m(b) = a) \\ \mathit{mapCond}(e, m) &\equiv (\forall\, a \in e : a \in \mathit{dom}(m) \Rightarrow m(a) \in e) \end{aligned}$$

where *dom*(*m*) is the domain of function *m*. Compared to **Def. 1**, the additional disjunct in *syncCond*′(*e*) allows *P* and *Q* to synchronize even when they do not share any label, provided at least one pair of their labels is related by *m*. The predicate *mapCond* ensures that if an event *e* contains a label *a* and *m* is defined on *a*, then *e* also contains the label that *a* is mapped to.
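Both predicates admit a direct executable reading (an illustrative Python sketch; encoding a mapping as a dict, and the particular labels used below, are my choices):

```python
# Illustrative renderings of syncCond' and mapCond from Def. 2.
def sync_cond(e, L_P, L_Q):
    # syncCond from Cond. 1
    return e <= (L_P - L_Q) or e <= (L_Q - L_P) or bool(e & L_P & L_Q)

def sync_cond_prime(e, L_P, L_Q, m):
    """syncCond'(e): ordinary synchronization, or some pair of labels
    from the two alphabets is related by m in either direction."""
    return sync_cond(e, L_P, L_Q) or any(
        m.get(a) == b or m.get(b) == a
        for a in (e & L_P) for b in (e & L_Q))

def map_cond(e, m):
    """mapCond(e, m): any label of e in dom(m) brings its image along."""
    return all(m[a] in e for a in e if a in m)

L_P, L_Q = {'a.b.s', 'a.b.p'}, {'s', 'p'}
m = {'a.b.s': 's', 'a.b.p': 'p'}
e = {'a.b.s', 's'}
print(sync_cond(e, L_P, L_Q))           # False: the alphabets share no label
print(sync_cond_prime(e, L_P, L_Q, m))  # True: a.b.s is mapped to s by m
print(map_cond(e, m))                   # True: the image s is present
print(map_cond({'a.b.s'}, m))           # False: the image s is missing
```

The first two calls show exactly the effect of the additional disjunct: the mapping, not a shared label, licenses the synchronization.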

Note that **Def. 2** differs from the definition of mapping composition in [28] and corrects a flaw in the latter. In particular, the definition in [28] omits condition *syncCond*′, and thus permits the undesirable case in which events *e*<sub>1</sub> and *e*<sub>2</sub> from *P* and *Q* are synchronized into the union *e* = *e*<sub>1</sub> ∪ *e*<sub>2</sub> even when the events do not share any label.

**Example.** Let *P* and *Q* be the abstract and public channel communication models from Fig. 1(a) and (b), respectively. The property that Eve never learns Alice's secret can be stated as follows:

$$\Phi \equiv \neg(\exists\, e \in E(L),\ l_1, l_2 \in e : l_1 = \texttt{a.*.s} \land l_2 = \texttt{*.e.s})$$

where \* ∈ {*a*, *b*, *e*, *u*}. In other words, Eve should never be able to engage in an event that involves the transmission of Alice's secret. From Fig. 1(a), it can be observed that *P* = Alice ∥ Eve |= Φ.

Suppose that we decide on a simple implementation scheme where the abstract messages sent by Alice are transmitted over the public channel in plaintext; this decision can be encoded as a mapping, *m*<sub>1</sub>, where each abstract label (i.e., *L*<sub>Alice</sub> in Fig. 1(c)) is mapped to concrete label p or s as follows:

$$\texttt{a.b.p},\ \texttt{a.e.p},\ \texttt{u.e.p} \mapsto_{m_1} \texttt{p} \qquad \texttt{a.b.s},\ \texttt{a.e.s},\ \texttt{u.e.s} \mapsto_{m_1} \texttt{s}$$

The resulting implementation can be constructed as the process *I*<sub>m<sub>1</sub></sub> ≡ (Alice ∥<sub>m<sub>1</sub></sub> Sender) ∥ (Eve ∥<sub>m<sub>1</sub></sub> RecvX). Due to the definition of mapping composition (**Def. 2**), the following event may appear in a trace of the overall composite process:

$$\langle \{ \texttt{a.b.s},\ \texttt{s},\ \texttt{u.e.s} \} \rangle \in T_{I_{m_1}}$$

Note that this trace violates the above property (i.e., *I*<sub>m<sub>1</sub></sub> ⊭ Φ). This can be seen as an example of *abstraction violation*: as a result of the decisions in *m*<sub>1</sub>, a.b.s and u.e.s now share the same underlying representation (s), and Eve is able to engage in an event containing a label (a.b.s) that was not previously available to it in the abstract model.
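The violation can be replayed concretely (an illustrative Python sketch; the dot-separated string encoding of labels and the predicate names are mine):

```python
# Replaying the abstraction violation: under m1, the violating event is a
# legal Def. 2 event (mapCond holds), and it pairs an a.*.s label with a
# *.e.s label, falsifying the property Phi.
m1 = {'a.b.p': 'p', 'a.e.p': 'p', 'u.e.p': 'p',
      'a.b.s': 's', 'a.e.s': 's', 'u.e.s': 's'}

def map_cond(e, m):
    # mapCond from Def. 2: labels in dom(m) bring their images along
    return all(m[a] in e for a in e if a in m)

def violates_phi(e):
    """True when e contains both a label a.*.s and a label *.e.s."""
    parts = [l.split('.') for l in e if l.count('.') == 2]
    alice_sends_secret = any(p[0] == 'a' and p[2] == 's' for p in parts)
    eve_gets_secret = any(p[1] == 'e' and p[2] == 's' for p in parts)
    return alice_sends_secret and eve_gets_secret

e = {'a.b.s', 's', 'u.e.s'}
print(map_cond(e, m1), violates_phi(e))  # True True
```

By contrast, the all-public counterpart of this event satisfies Φ, since no secret label is involved.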

**Properties of the Mapping Composition Operator.** Mapping composition is a generalization of parallel composition: The latter is a special case of mapping composition where the given mapping is empty:

**Lemma 1.** *Given a pair of processes P and Q, if m* = ∅ *then P* ∥<sub>m</sub> *Q* = *P* ∥ *Q.*

**Commutativity.** The proposed mapping composition operator is commutative: i.e., *P* ∥<sub>m</sub> *Q* = *Q* ∥<sub>m</sub> *P*. This property can be inferred from the fact that **Def. 2** is symmetric with respect to *P* and *Q*. It follows that, being a special case of mapping composition, the parallel composition operator is also commutative.

**Associativity.** The mapping composition operator is associative under the following conditions on the alphabets of involved processes and mappings:

**Theorem 1.** *Given processes P, Q, and R, let X* = (*P* ∥<sub>m<sub>1</sub></sub> *Q*) ∥<sub>m<sub>2</sub></sub> *R and Y* = *P* ∥<sub>m<sub>3</sub></sub> (*Q* ∥<sub>m<sub>4</sub></sub> *R*)*. If* E<sub>X</sub> = E<sub>Y</sub>*, then X* = *Y.*

*Proof.* Available in the extended version of this paper [27].

#### **3 Synthesis Problems**

The *mapping verification problem* is to check, given processes *P* and *Q*, a mapping *m*, and a specification Φ, whether (*P* ∥<sub>m</sub> *Q*) |= Φ. This problem was studied by Kang et al. [28]. In this paper, we introduce and study, for the first time to our knowledge, the problem of *mapping synthesis*. We begin with a simple formulation of the problem and then generalize it. We will not fix what exactly the specification Φ may be, nor the satisfaction relation |=, as the mapping synthesis problems defined below are generic and can work with any type of specification or satisfaction relation. In Sect. 5.1, we discuss how this generic framework is instantiated in our implementation.

*Problem 1 (Mapping Synthesis).* Given processes *P* and *Q*, and a specification Φ, find, if it exists, a mapping *m* such that (*P* ∥<sub>m</sub> *Q*) |= Φ. We call such an *m* a *valid* mapping.

Note that if Φ is a *trace* property [2,29], this problem can be stated as an ∃∀ problem; that is, finding a witness *m* for the formula ∃ *m* : ∀ *t* ∈ *T*<sub>*P*∥<sub>*m*</sub>*Q*</sub> : *t* ∈ Φ.
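For a finite candidate space, this ∃∀ formulation admits a direct enumerate-and-check reading. The sketch below is purely illustrative: `traces_of`, `PHI`, and the candidate mappings are hypothetical stand-ins (not part of the paper's tool), with the trace set of (*P* ∥<sub>*m*</sub> *Q*) reduced to a toy function of the mapping.

```python
# Hypothetical sketch of Problem 1 as an exists-forall search.
# traces_of, PHI, and CANDIDATES are invented for illustration.

def traces_of(m):
    # Stand-in for the trace set of (P ||_m Q); here we derive toy
    # traces directly from the mapping's image.
    return {("send", v) for v in m.values()}

def PHI(trace):
    # Toy trace property: the secret label "s" never travels unencrypted.
    return trace != ("send", "s")

CANDIDATES = [
    {"a.b.s": "s"},    # invalid: secret sent in the clear
    {"a.b.s": "s.y"},  # valid: secret encrypted with key y
]

def synthesize_mapping(candidates):
    """Witness search for: exists m . forall t in T_{P ||_m Q} . t in PHI."""
    for m in candidates:
        if all(PHI(t) for t in traces_of(m)):
            return m  # m is a valid mapping
    return None       # no valid mapping in the candidate space
```

The brute-force loop makes the quantifier structure explicit: the outer `for` realizes the existential over mappings, and the inner `all` realizes the universal over traces.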

Instead of synthesizing *m* from scratch, the developer may wish to express their partial system knowledge as a given *constraint*, and ask the synthesis tool to generate a mapping that adheres to this constraint. For instance, given labels *a*, *b*, *c* ∈ *L*, one may express a constraint that *a* must be mapped to either *b* or *c* as part of every valid mapping; this gives rise to two possible candidate mappings, *m*<sup>1</sup> and *m*2, where *m*1(*a*) = *b* and *m*2(*a*) = *c*. Formally, let *M* be the set of all possible mappings between labels *L*. A *mapping constraint C* ⊆ *M* is a set of mappings that are considered legal candidates for a final, synthesized valid mapping. Then, the problem of synthesizing a mapping given a constraint can be formulated as follows:

*Problem 2 (Generalized Mapping Synthesis).* Given processes *P* and *Q*, specification Φ, and mapping constraint *C*, find, if it exists, a valid mapping *m* such that *m* ∈ *C*.

Note that Problem 1 is a special case of Problem 2 where *C* = *M*. The synthesis problem can be further generalized to one that involves synthesizing a constraint that contains a *set* of valid mappings:

*Problem 3 (Mapping Constraint Synthesis).* Given processes *P* and *Q*, specification Φ, and mapping constraint *C*, generate, if it exists, a non-empty set of valid mappings *C*′ such that *C*′ ⊆ *C*. We call such a *C*′ *valid* with respect to *P*, *Q*, Φ, and *C*.

A procedure for solving Problem 3 can be used to solve Problem 2: having generated constraint *C*′, we can pick any mapping *m* ∈ *C*′. Such an *m* is guaranteed to be valid and also to belong to *C*.

In practice, it is desirable for *C*′ to be as large as possible while still being valid, as it provides more implementation choices (i.e., possible mappings). In particular, we say that a mapping constraint *C*′ is *maximal* with respect to *P*, *Q*, Φ, and *C* if and only if (1) *C*′ is valid with respect to *P*, *Q*, Φ, and *C*, and (2) there exists no other constraint *C*″ such that *C*″ is also valid w.r.t. *P*, *Q*, Φ, *C*, and *C*′ ⊂ *C*″. Then, our final synthesis problem can be stated as follows:

*Problem 4 (Maximal Constraint Synthesis).* Given processes *P* and *Q*, property Φ, and constraint *C*, generate, if it exists, a maximal constraint *C*′ with respect to *P*, *Q*, Φ, *C*.

If found, *C*′ is a *locally* optimal solution. In general, there may be multiple maximal constraints for given *P*, *Q*, Φ, and *C*.

**Example.** Returning to our running example, an alternative implementation of the abstract communication model over the public channel involves encrypting messages sent by Alice to Bob using a key (y) that Eve does not possess; this decision can be encoded as the following *valid* mapping *m*2:

$$\mathsf{a.b.p} \longmapsto\_{m\_2} \mathsf{p.y} \quad \mathsf{a.b.s} \longmapsto\_{m\_2} \mathsf{s.y} \quad \mathsf{a.e.p} \longmapsto\_{m\_2} \mathsf{p.x} \quad \mathsf{a.e.s} \longmapsto\_{m\_2} \mathsf{s.y}$$

Since Eve cannot read messages encrypted using key y, she is unable to obtain Alice's secret over the public channel; thus, *Im*<sup>2</sup> |= Φ, where *Im*<sup>2</sup> ≡ (Alice ∥<sub>*m*2</sub> Sender) ∥ (Eve ∥<sub>*m*2</sub> RecvX).

The following mapping, *m*3, which leaves non-secret messages unencrypted in the low-level channel (as p), is also valid with respect to Φ:

$$\mathsf{a.b.p} \mapsto\_{m\_3} \mathsf{p} \quad \mathsf{a.b.s} \mapsto\_{m\_3} \mathsf{s.y} \quad \mathsf{a.e.p} \mapsto\_{m\_3} \mathsf{p} \quad \mathsf{a.e.s} \mapsto\_{m\_3} \mathsf{s.y}$$

since Eve being able to read non-secret messages does not violate the property. Thus, the developer may choose either *m*<sup>2</sup> or *m*<sup>3</sup> to implement the abstract channel and ensure that Alice's secret remains protected from Eve. In other words, *C*<sup>1</sup> = {*m*2, *m*3} is a valid (but not necessarily maximal) mapping constraint with respect to the desired property. Furthermore, *C*<sup>1</sup> is arguably more desirable than another constraint *C*<sup>2</sup> = {*m*2}, since the former gives the developer more implementation choices than the latter does.

#### **4 Synthesis Technique**

**Mapping Representation.** In our approach, mappings are represented *symbolically* as logical expressions over variables that correspond to labels being mapped. The symbolic representation has the following advantages over an explicit one (where the entries of mapping *m* are enumerated explicitly): (1) it provides a succinct representation of implementation decisions to the developer (which is especially important as the size of the mapping grows large) and (2) it allows the user to specify partial implementation decisions (i.e., given constraint *C*) in a *declarative* manner.

We adopt the symbolic representation and, inspired by SyGuS [3], use a *syntactic* approach where the space of candidate mapping constraints is restricted to expressions that can be constructed from a given grammar. Our grammar is specified as follows:

$$\begin{aligned} Term &:= Var \mid \text{Const} & \quad Assign := (Term = Term) \\ Expr &:= Assign \mid \neg Assign \mid Assign \Rightarrow Assign \mid Expr \land Expr \end{aligned}$$

where *Var* is a set of variables that represent parameters inside a label, and *Const* is the set of constant values. Intuitively, this grammar captures implementation decisions that involve assignments of parameters in an abstract label to their counterparts in a concrete label (represented by the equality operator "="). A logical implication is used to construct a conditional assignment of a parameter.

A mapping constraint is symbolically represented as a set of predicates, each of the form X(*abs*, *conc*) over *symbolic* labels *abs* and *conc*, where *abs* represents the label being mapped to *conc*. The body of each predicate is constructed as an expression from the above grammar. For example, let *abs* = a.b.*msg* be a symbolic encoding of labels that represent Alice communicating to Bob, with variable *msg* corresponding to the message being sent; similarly, let *conc* = *msg*′.*key* be a symbolic label in the public channel model, where *msg*′ and *key* correspond to the message being transmitted and the key used to encrypt it (if any). Then, the expression

$$\mathcal{X}(\mathsf{a.b.}msg,\; msg'.key) \equiv msg = msg' \land (msg = \mathsf{s} \Rightarrow key = \mathsf{y})$$

states that (1) parameter *msg* in the abstract label must be equal to *msg*′ in the concrete label (i.e., the message being transmitted must be preserved by the mapping) and (2) if the message is a secret, key y must be used to encrypt it in the implementation.

The set of mappings that predicate X(*abs*, *conc*) represents is defined as:

$$C = \{\, m : L \to L \mid \forall abs \in L : (abs \in dom(m) \Leftrightarrow \exists\, conc \in L : \mathcal{X}(abs, conc)) \,\land\, (abs \in dom(m) \Rightarrow \mathcal{X}(abs, m(abs)))\,\}$$

That is, a mapping *m* is allowed by X(*abs*, *conc*) if and only if for each label *abs*, (1) *m* is defined over *abs* if and only if there exists some label *conc* for which X(*abs*, *conc*) evaluates to true, and (2) *m* maps *abs* to such a label *conc*.
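Over a small finite label set, this semantics can be made concrete by enumeration. The sketch below is an illustration under toy assumptions: `chi` mirrors the example predicate *msg* = *msg*′ ∧ (*msg* = s ⇒ *key* = y), and the label encodings (`ABS`, `CONC`) are invented for the example, not taken from the paper's models.

```python
import itertools

# Toy finite label domains (hypothetical encodings of a.b.msg and msg'.key).
ABS = ["a.b.p", "a.b.s"]             # abstract labels: a.b.msg
CONC = ["p.x", "p.y", "s.x", "s.y"]  # concrete labels: msg'.key

def chi(abs_label, conc_label):
    # Mirrors  msg = msg' /\ (msg = s => key = y).
    msg = abs_label.split(".")[2]
    msg2, key = conc_label.split(".")
    return msg == msg2 and (msg != "s" or key == "y")

def allowed_mappings():
    """All m with dom(m) = {abs | exists conc. chi(abs, conc)} and
    chi(abs, m(abs)) for every abs in dom(m)."""
    choices = {a: [c for c in CONC if chi(a, c)] for a in ABS}
    domain = [a for a, cs in choices.items() if cs]
    return [dict(zip(domain, combo))
            for combo in itertools.product(*(choices[a] for a in domain))]
```

Here each allowed mapping must keep the message intact, while the secret label admits only the y-encrypted concrete form; the public label may use either key.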

**Algorithmic Considerations.** To ensure that the algorithm terminates, the set of expressions that may be constructed using the given grammar is restricted to a finite set, by bounding the domains of data types (e.g., distinct messages and keys in our running example) and the size of expressions. We also assume the existence of a verifier that is capable of checking whether a candidate mapping satisfies a given specification Φ. The verifier implements function *verify*(*C*, *P*, *Q*, Φ) which returns *OK* if and only if every mapping allowed by constraint *C* is valid with respect to *P*, *Q*, Φ.

**Generalization Algorithm.** Once we limit the number of candidate expressions to be finite, we can use a brute-force algorithm to enumerate and check those candidates one by one. However, this naive algorithm is likely to suffer from scalability issues. Thus, we present an algorithm that takes a generalization-based approach to identify and prune undesirable parts of the search space. A key insight is that only a few implementation decisions—captured by some *minimal subset* of the entries in a mapping—may be sufficient to imply that the resulting implementation will be invalid. Thus, given some invalid mapping, the algorithm attempts to identify this minimal subset and construct a larger constraint *Cbad* that is guaranteed to contain only invalid mappings.

The outline of the algorithm is shown in Fig. 2. The function *synthesize* takes four inputs: processes *P* and *Q*, specification Φ, and a user-specified mapping constraint *C*. It also maintains a set of constraints *X*, which keeps track of "bad" regions of the search space that do not contain any valid mappings.

In each iteration, the algorithm selects some mapping *m* from *C* (line 3) and checks whether it belongs to one of the constraints in *X* (meaning, the mapping is guaranteed to result in an invalid implementation). If so, it is simply discarded (lines 4–5).

Otherwise, the verifier is used to check whether *m* is valid with respect to Φ (line 7). If so, then *generalize* is invoked to produce a maximal mapping constraint *Cmaximal*, which represents the largest set that contains {*m*}, is contained in *C*, and is valid with respect to *P*, *Q*, Φ (line 9). If, on the other hand, *m* is invalid (i.e., it fails to preserve Φ), then *generalize* is invoked to compute the largest superset *Cbad* of {*m*} that contains only invalid mappings (i.e., those that satisfy ¬Φ). The set *Cbad* is then added to *X* and used to prune out subsequent, invalid candidates (line 13).

**Fig. 2.** An algorithm for synthesizing a maximal mapping constraint.
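A minimal sketch of this loop follows, under simplifying assumptions: constraints are represented as plain collections of mappings rather than symbolic expressions, and `verify` and `generalize` are assumed callbacks standing in for the paper's procedures.

```python
# Sketch of the synthesis loop of Fig. 2 (simplified representation).
# verify(C') returns True iff every mapping in C' is valid w.r.t. Phi;
# generalize(C', valid=...) grows C' into a maximal valid (or invalid) region.

def synthesize(candidates, verify, generalize):
    """Return a maximal valid constraint, or None if no valid mapping exists."""
    bad = []                                      # X: known-invalid regions
    for m in candidates:
        if any(m in region for region in bad):
            continue                              # lines 4-5: skip known-bad m
        if verify([m]):                           # line 7: is m valid?
            return generalize([m], valid=True)    # line 9: grow to maximal
        bad.append(generalize([m], valid=False))  # line 13: grow bad region
    return None
```

A toy instantiation, where "valid" mappings are the even integers below 10, illustrates the pruning: once one odd candidate is generalized into the bad region, all remaining odd candidates are skipped without invoking the verifier.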

**Constraint Generalization.** The function *generalize*(*C*′, *P*, *Q*, Φ, *C*) computes a maximal set that contains *C*′, is contained within *C*, and only permits mappings that satisfy Φ. This function is used in two different ways: (1) to identify an undesirable region of the candidate space that should be avoided, and (2) to produce a maximal version of a valid mapping constraint.

The procedure works by incrementally growing *C*′ into a larger set *C*<sub>relaxed</sub>, stopping when *C*<sub>relaxed</sub> contains at least one mapping that violates Φ. Suppose that constraint *C*′ is represented by a symbolic expression X, which itself is a conjunction of *n* subexpressions *k*<sub>1</sub> ∧ *k*<sub>2</sub> ∧ ... ∧ *k*<sub>*n*</sub>, where each *k*<sub>*i*</sub> for 1 ≤ *i* ≤ *n* represents a (possibly conditional) assignment of a variable or a constant to some label parameter. The function *decompose*(*C*′) takes the given constraint and returns the set of such subexpressions. The function *relax*(*C*′, *k*<sub>*i*</sub>) then computes a new constraint by removing *k*<sub>*i*</sub> from *C*′; this new constraint, *C*<sub>relaxed</sub>, is a larger set of mappings that subsumes *C*′.

The verifier is then used to check *C*<sub>relaxed</sub> against Φ (line 22). If *C*<sub>relaxed</sub> is still valid with respect to Φ, then the implementation decision encoded by *k*<sub>*i*</sub> is irrelevant for Φ, meaning we can safely remove *k*<sub>*i*</sub> from the final synthesized constraint *C*′ (line 24). If not, *k*<sub>*i*</sub> is retained as part of *C*′, and the algorithm moves on to the next subexpression as a candidate for removal (line 20). On line 23, we also ensure that *C*<sub>relaxed</sub> does not violate the predefined user constraint *C*.
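The generalization step can be sketched as follows, again under simplifying assumptions: a constraint is a list of conjuncts *k*<sub>1</sub>..*k*<sub>*n*</sub>, and `verify_ok` and `within_user_constraint` are hypothetical stand-ins for the verifier check (line 22) and the user-constraint check (line 23).

```python
# Sketch of generalize: try to drop each conjunct; keep it only if dropping
# it admits a mapping that violates Phi or leaves the user constraint C.

def generalize(conjuncts, verify_ok, within_user_constraint):
    kept = list(conjuncts)
    for k in list(conjuncts):
        relaxed = [c for c in kept if c != k]       # relax(C', k_i)
        if verify_ok(relaxed) and within_user_constraint(relaxed):
            kept = relaxed                          # k_i is irrelevant to Phi
    return kept            # maximal: no remaining conjunct can be removed
```

In the running example below, only the conjunct guarding the secret message survives, mirroring how *k*<sub>1</sub> is dropped while *k*<sub>2</sub> is retained.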

**Example.** Let *abs* = a.e.*msg* be a symbolic label that represents Alice sending a message (*msg*) to Eve, and *conc* = *msg*′.*key* be its corresponding label in the public channel model. Then, one candidate constraint *C*′ for mappings from the high-level to low-level labels can be specified as the following expression:

$$\mathcal{X}(\mathsf{a.e.}msg,\; msg'.key) \equiv msg = msg' \land (msg = \mathsf{s} \Rightarrow key = \mathsf{y}) \land (msg = \mathsf{p} \Rightarrow key = \mathsf{x})$$

Suppose that this constraint *C*′ has been verified to be valid with respect to *P*, *Q*, and Φ. Next, the generalization procedure removes the subexpression *k*<sub>1</sub> ≡ (*msg* = p ⇒ *key* = x) from *C*′, resulting in a constraint *C*<sub>relaxed</sub> that is represented as:

$$\mathcal{X}(\mathsf{a.e.}msg,\; msg'.key) \equiv msg = msg' \land (msg = \mathsf{s} \Rightarrow key = \mathsf{y})$$

When checked by the verifier (line 22), *C*<sub>relaxed</sub> is still valid, meaning that the decision encoded by *k*<sub>1</sub> is irrelevant to the property; thus, *k*<sub>1</sub> can be safely removed.

However, removing *k*<sub>2</sub> ≡ (*msg* = s ⇒ *key* = y) results in a violation of the property. Thus, *k*<sub>2</sub> is kept as part of the final maximal constraint expression.

#### **5 Implementation and Case Studies**

#### **5.1 Implementation**

We have built a prototype implementation<sup>2</sup> of the synthesis algorithm described in Sect. 4. Our tool uses the Alloy Analyzer [25] as the underlying modeling and verification engine. Alloy's flexible, declarative relational logic is convenient for encoding the semantics of the mapping composition as well as specifying mapping constraints. The analysis engine for Alloy uses an off-the-shelf SAT solver to perform *bounded* verification [25]. In particular, our current prototype is capable of synthesizing mappings to preserve *reachability* and *safety* properties, which can be expressed in the forms ∃ *t* : *t* ∈ *T*<sub>*P*</sub> ∧ *t* ∈ φ (reachability) and ¬ ∃ *t* : *t* ∈ *T*<sub>*P*</sub> ∧ *t* ∈ φ (safety) for some process *P* and property φ.
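For intuition, the two property forms can be read directly over an explicitly enumerated finite trace set; the tool itself performs bounded SAT-based checking via Alloy, so the functions below are only an illustrative reading of the formulas, with traces and φ as toy stand-ins.

```python
# Illustrative reading of the two supported property forms over a finite
# trace set T_P. phi(t) returns True iff trace t is in phi.

def holds_reachability(traces, phi):
    """exists t . t in T_P and t in phi"""
    return any(phi(t) for t in traces)

def holds_safety(traces, phi):
    """not exists t . t in T_P and t in phi  (phi describes bad traces)"""
    return not any(phi(t) for t in traces)
```

Note the duality: safety is the negation of reachability of the bad traces, which is why both forms reduce to the same bounded existential check in the SAT backend.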

**Fig. 3.** A high-level overview of the two OAuth protocols, with a sequence of event labels that describe protocol steps in the typical order that they occur. Each arrowed edge indicates the direction of the communication. Variables inside labels with the prefix ret represent return parameters. For example, in Step 2 of OAuth 2.0, User passes their user ID and password as arguments to AuthServer, which returns ret code back to User in response.

<sup>2</sup> The tool, along with the models used in our case studies, is available at https://github.com/eskang/MappingSynthesisTool.

However, our synthesis approach does not prescribe the use of a particular modeling and verification engine, and can be implemented using other tools as well (such as an SMT solver [11,12]).

#### **5.2 Case Studies: OAuth Protocols**

As two major case studies, we took on the problem of synthesizing valid mappings for *OAuth 1.0 and OAuth 2.0*, two real-world protocols used for *third-party authorization* [24]. The purpose of the OAuth protocol family in general is to allow an application (called a *client* in the OAuth terminology) to access a resource from another application (an *authorization server*) without needing the credentials of the resource owner (a *user*). For example, a gaming application may initiate an OAuth process to obtain a list of friends from a particular user's Facebook account, provided that the user has authorized Facebook to release this resource to the client.

OAuth 2.0 is the newer version of the protocol, while OAuth 1.0 is an older version. Although OAuth 2.0 is intended to be a replacement for OAuth 1.0, there has been much contention within the developer community about whether it actually improves over its predecessor in terms of security [17]. Since both protocols are designed to provide the same security guarantees (i.e., both share common properties), our goal was to apply our synthesis approach to systematically compare what developers would be required to do in order to construct secure web-based implementations of the two.

#### **5.3 Formal Modeling**

For our case studies, we constructed the following set of Alloy models: (1) model *P*<sub>1.0</sub> representing OAuth 1.0; (2) model *P*<sub>2.0</sub> representing OAuth 2.0; (3) model *Q* representing generic HTTP interactions between a browser and a server, as well as the behavior of a web-based attacker; (4) specification Φ describing desired protocol properties (the same for both OAuth 1.0 and 2.0); and (5) mapping constraints *C*<sub>1.0</sub> and *C*<sub>2.0</sub> representing initial, user-specified partial mappings for OAuth 1.0 and 2.0, respectively. The complete models are approximately 1800 lines of Alloy code in total, and took around 4 man-months to build. These models were then provided as inputs to our tool to solve two instances of Problem 4 from Sect. 3. In particular, we synthesized a maximal mapping constraint *C*′<sub>1.0</sub> such that every *m* ∈ *C*′<sub>1.0</sub> ensures that *P*<sub>1.0</sub> ∥<sub>*m*</sub> *Q* |= Φ, and a maximal mapping constraint *C*′<sub>2.0</sub> such that every *m* ∈ *C*′<sub>2.0</sub> ensures that *P*<sub>2.0</sub> ∥<sub>*m*</sub> *Q* |= Φ.

**OAuth Models** (*P*<sub>1.0</sub>, *P*<sub>2.0</sub>). We constructed Alloy models of OAuth 1.0 and 2.0 based on the official protocol specifications [23,24]. Due to limited space, we give only a brief overview of the models. Each model consists of four processes: Client, AuthServer, and two users, Alice and Eve (the latter with malicious intent to access Alice's resources).

A typical OAuth 2.0 workflow, shown in Fig. 3(a), begins with a user (Alice or Eve) initiating a new protocol session with Client (initiate). The user is then asked to prove their own identity to AuthServer (by providing a user ID and a password) and officially authorize the client to access their resources (authorize). Given the user's authorization, the server allocates a unique code for the user and redirects them back to the client. The user forwards the code to the client (forward), which can then exchange the code for an access token to the user's resources (getToken).

Like in OAuth 2.0, a typical workflow in OAuth 1.0 (depicted in Fig. 3(b)) begins with a user initiating a new session with Client (initiate). Instead of immediately directing the user to AuthServer, however, Client first obtains a *request token* from AuthServer and associates it with the current session (getReqToken). The user is then asked to present the same request token to AuthServer and authorize Client to access their resources (authorize). Once notified by the user that the authorization step has taken place (notify), Client exchanges the request token for an access token that can be used subsequently to access the user's resources (getAccessToken).

**Specification** (Φ). There are two desirable properties of OAuth protocols in general: (1) **Authenticity**: When the client receives an access token, it must correspond to the user who initiated the current protocol session. (2) **Completion**: There exists at least one trace in which the protocol interactions are carried out to completion in the order of steps described in Fig. 3. Authenticity is a safety property while completion is a reachability property. The input specification Φ consists of these two properties. Completion is essential for ruling out mappings that over-constrain the resulting implementation and prevent certain steps of the protocol from being performed.

**HTTP Platform Model** (*Q*). Our goal was to explore and synthesize *web-based* implementations of OAuth. For this purpose, we constructed a formal model depicting interactions between a generic HTTP server and web browser. The model contains two types of processes, Server and Browser (which may be instantiated into multiple processes representing different servers and browsers). They interact with each other over HTTP requests, which share the following signature:

req(*method* : Method, *url* : URL, *headers* : List[Header], *body* : Body,*ret resp* : Resp)

The parameters of an HTTP request have their own internal structures, each consisting of its own parameters as follows:

url(*host* : Host, *path* : Path, *queries* : List[Query]) header(*name* : Name, *val* : Value) resp(*status* : Status, *headers* : List[Header], *body* : Body)
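The signatures above can be transcribed as plain data types. The following is a direct sketch (not taken from the paper's Alloy models), with `List[...]` rendered as Python lists and the return parameter *ret resp* as an optional response field.

```python
# Sketch of the HTTP request/parameter signatures as plain data types.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Header:
    name: str          # e.g., "cookie"
    val: str

@dataclass
class URL:
    host: str          # e.g., "client.com"
    path: str          # e.g., "forward"
    queries: List[str] = field(default_factory=list)

@dataclass
class Resp:
    status: int
    headers: List[Header] = field(default_factory=list)
    body: str = ""

@dataclass
class Req:
    method: str        # e.g., "POST"
    url: URL
    headers: List[Header] = field(default_factory=list)
    body: str = ""
    ret_resp: Resp = None   # return parameter, filled in by the server
```

A request such as `req(POST, http://client.com/forward?..., [(cookie, ...)], ...)` from the attack scenario later in this section would then be a `Req` value whose `url.queries` and `headers` carry the code and session cookie.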


**Fig. 4.** User-specified partial mappings from OAuth 2.0 to HTTP. Terms highlighted in blue and red are variables that represent the parameters inside OAuth and HTTP labels, respectively. For example, in forward, the abstract parameters code and session may be transmitted as part of an URL query, a header, or the request body, although its URL is fixed to http://client.com/forward. (Color figure online)

Our model describes *generic*, *application-independent* HTTP interactions. In particular, each Browser process is a machine that constructs, at each communication step with Server, an arbitrary HTTP request by non-deterministically selecting a value for each parameter of the request. The processes, however, follow a *platform-specific* logic; for instance, when given a response from Server that instructs a browser cookie to be stored at a particular URL, Browser will include this cookie along with every subsequent request directed at that URL. In addition, the model includes a process that depicts the behavior of a web attacker, who may operate their own malicious server and exploit weaknesses in a browser to manipulate the user into sending certain HTTP requests.

**Mapping Constraint** (*C*<sub>1.0</sub>, *C*<sub>2.0</sub>). Building a web-based implementation of OAuth involves decisions about how abstract protocol operations are to be realized in terms of HTTP requests. As an input to the synthesizer, we specified an initial set of constraints that describe partial implementation decisions for both OAuth protocols; the ones for OAuth 2.0 are shown in Fig. 4. These decisions include a designation of fixed host and path names inside URLs for various OAuth operations (e.g., http://client.com/initiate for the OAuth initiate event), and how certain parameters are transmitted as part of an HTTP request (ret session as a return cookie in initiate). It is reasonable to treat these constraints as given, since they describe decisions that are common across typical web-based OAuth implementations.

**Insecure Mapping for OAuth 2.0.** Let us now give an example of an insecure mapping that satisfies the user-given constraint in Fig. 4 but could introduce a security vulnerability into the resulting implementation. Later in Sect. 5.4, we describe how our tool can be used to synthesize a secure mapping that prevents this vulnerability.

Consider the OAuth 2.0 workflow from Fig. 3(a). In order to implement the forward operation, for instance, the developer must determine how the parameters *code* and *session* of the abstract event label are encoded using their concrete counterparts in an HTTP request. A number of choices are available. In one possible implementation, the authorization code may be transmitted as a query parameter inside the URL, and the session as a browser cookie, as described by the following constraint expression, X<sub>1</sub>:

$$\begin{aligned} \mathcal{X}\_1(a,b) \equiv\ & (b.method = \mathsf{POST}) \land (b.url.host = \mathsf{client.com})\ \land \\ & (b.url.path = \mathsf{forward}) \land (b.url.queries[0] = a.code)\ \land \\ & (b.headers[0].name = \mathsf{cookie}) \land (b.headers[0].value = a.session) \end{aligned}$$

where POST, client.com, forward, and cookie are predefined constants, and *l*[*i*] refers to the *i*-th element of list *l*.

This constraint, however, allows a vulnerable implementation in which malicious user Eve performs the first two steps of the workflow in Fig. 3(a) using her own credentials, and obtains a unique code (codeEve) from the authorization server. Instead of forwarding this code to Client (as she is expected to), Eve keeps it to herself, and crafts her own web page that triggers the visiting browser to send the following HTTP request:

req(POST, http://client.com/forward?codeEve, ...)

Suppose that Alice is a naive browser user who may occasionally be enticed or tricked into visiting malicious web sites. When Alice visits the page set up by Eve, Alice's browser automatically generates the above HTTP request, which, given the decisions in X1, corresponds to a valid forward event:

```
forward(codeEve, sessionAlice) →
req(POST, http://client.com/forward?codeEve, [(cookie, sessionAlice)], ...)
```
Due to the standard browser logic, the cookie corresponding to sessionAlice is included in every request to client.com. As a result, Client mistakenly accepts codeEve as the one for Alice, even though it belongs to Eve, violating the authenticity property of OAuth (this attack is also called *session swapping* [39]).

#### **5.4 Results**

Our synthesis tool was able to generate valid mapping constraints for both OAuth protocols. In particular, the constraints describe mitigations against attacks that exploit an interaction between the OAuth logic and security vulnerabilities in a web browser.

**OAuth 2.0.** The synthesized symbolic mapping constraint for OAuth 2.0 consists of 39 conjuncts in total, each capturing a (conditional) assignment of a concrete HTTP parameter to a constant (e.g., *b*.*url*.*path* = forward) or an abstract OAuth parameter (e.g., *b*.*url*.*queries*[0] = *a*.*code*). In particular, the constraint captures mitigations against *session swapping* [39] and *covert redirect* [16]. Due to limited space, we omit the full constraint, but instead describe how the vulnerability described at the end of Sect. 5.3 can be mitigated by our synthesized mapping.

Consider the insecure mapping expression X<sup>1</sup> from Sect. 5.3. The mapping constraint synthesized by our tool, X2, fixes the major problem of X1; namely, that in a browser-based implementation, the client cannot trust an authorization code as having originated from a particular user (e.g., Alice), since the code may be intercepted or interjected by an attacker (Eve) while in transit through a browser. A possible solution is to explicitly identify the origin of the code by requiring an additional piece of tracking information to be provided in each forward request. The mapping expression X<sup>2</sup> synthesized by our tool encodes one form of this solution:

$$\begin{aligned} \mathcal{X}\_2(a,b) \equiv\ \mathcal{X}\_1(a,b)\ & \land\ (a.session = \mathsf{session}\_{\mathsf{Alice}} \Rightarrow b.url.queries[1] = \mathsf{nonce}\_0)\ \land \\ & \quad\ (a.session = \mathsf{session}\_{\mathsf{Eve}} \Rightarrow b.url.queries[1] = \mathsf{nonce}\_1) \end{aligned}$$

where nonce<sub>0</sub>, nonce<sub>1</sub> ∈ Nonce are constants defined in the HTTP model<sup>3</sup>. In particular, X<sub>2</sub> stipulates that every forward request must include an additional value (a nonce) as an argument besides the code and the session, and that this nonce be unique for each session value. X<sub>2</sub> ensures that the resulting implementation satisfies the desired properties of OAuth 2.0.

**OAuth 1.0.** The synthesized symbolic mapping constraint for OAuth 1.0 consists of 48 conjuncts in total, capturing how the abstract parameters of the five OAuth 1.0 operations are related to concrete HTTP parameters. The constraint synthesized by our tool

<sup>3</sup> A nonce is a unique piece of string intended to be used once in communication.


**Fig. 5.** Experimental results (all times in seconds). "# total candidates" is the total number of possible symbolic mapping expressions; "# explored" is the number of iterations taken by the main synthesis loop (lines 3–15, Fig. 2) before a solution was found. Out of these iterations, "# verified" mappings were verified (line 7), while the rest were identified as invalid and skipped (line 5). "Total time" (the sum of the Total Verification and Generalization columns) refers to the time spent by the tool to synthesize a maximal constraint.

for OAuth 1.0 encodes a mitigation against the *session fixation* [15] attack; in short, this mitigation involves strengthening the notify operation with unique nonces (similar to the way the forward operation in OAuth 2.0 was fixed above) to prevent the attacker from violating the authenticity property.

**Performance.** Figure 5 shows experimental results for the two OAuth protocols<sup>4</sup>. Overall, the synthesizer took approximately 17.6 and 24.0 min to synthesize the constraints for OAuth 1.0 and 2.0, respectively. In both cases, the tool spent a considerable amount of time on the generalization step to learn the invalid regions of the search space. Note that generalization is effective at identifying and discarding a very large number of invalid candidates; it was able to skip 2184 out of 2465 candidates for OAuth 1.0 (≈88.6%) and 1292 out of 1453 for OAuth 2.0 (≈88.9%). Our generalization technique was particularly effective for the OAuth protocols, since a significant percentage of the candidate constraints would result in an implementation that violates the completion property (i.e., one that prevents Alice or Eve from completing a protocol session in the expected order). Often, the decisions contributing to this violation could be localized to a small subset of entries in a mapping (for example, attempting to send a cookie to a mismatched URL, which is inconsistent with the behavior of the browser process). By identifying this subset, our algorithm was able to discover and eliminate a large number of invalid mappings.

#### **6 Related Work**

Our approach has been inspired by the success of recent synthesis paradigms such as *sketching* [36–38], *oracle-guided synthesis* [26], and *syntax-guided synthesis* [3]. Our technique shares many similarities with these approaches in that (1) it allows the user to provide a partial specification of the artifact to be synthesized (in the form of constraints or examples), letting the underlying engine *complete* the remaining parts; and (2) it relies on an interaction between the verifier, which checks candidate solutions, and the synthesizer, which prunes the search space based on previous invalid candidates. Our work also differs in a number of aspects. First, we synthesize mappings from high-level models to low-level execution platforms, which to our knowledge has not been

<sup>4</sup> The experiments were performed on a Mac OS X 2.7 GHz laptop with 8G RAM and MiniSat [13] as the underlying SAT solver employed by the Alloy Analyzer.

considered before. Second, our approach leverages constraint generalization to not only prune the search space, but also to produce a constraint capturing a (locally) maximal set of valid mappings. Third, our application domain is in security protocols.

A large body of literature exists on *refinement-based* methods to system construction [4,20]. These approaches involve building an implementation *Q* that is a behavioral refinement of *P*; such *Q*, by construction, would satisfy the properties of *P*. In comparison, we start with an assumption that *Q* is a *given* platform, and that the developer may not have the luxury of being able to modify or build *Q* from scratch. Thus, instead of behavioral refinement (which may be too challenging to achieve), we aim to preserve some critical property φ when *P* is implemented using *Q*.

The task of synthesizing a valid mapping can be seen as a type of the *model merging* problem [8]. This problem has been studied in various contexts, including architectural views [31], behavioral models [6,32,40], and database schemas [34]. Among these, our work is most closely related to merging of partial behavioral models [6,40]. In these works, given a pair of models *M*<sup>1</sup> and *M*2, the goal is to construct *M* that is a behavioral refinement of both *M*<sup>1</sup> and *M*2. The approach proposed in this paper differs in that (1) the mapping composition involves merging a pair of events with distinct alphabet labels into a single event that retains all of those labels, and (2) the composed process (*P mQ*) need not be a behavioral refinement of *P* or *Q*, as long as it satisfies property φ.

Bhargavan and colleagues present a compiler that takes a high-level program written using *session types* [22] and automatically generates a low-level implementation [7]. This technique is closer to compilation than to synthesis in that it uses a fixed translation scheme from high-level to low-level operations in a specific language environment (.NET), without searching a space of possible translations. Synthesizing a low-level implementation from a high-level specification has also been studied in the context of data structures [18,19], although the underlying representation there (relational algebra for data schema specification) is very different from ours (process algebra).

A significant contribution of our work is the production of formal models for real-world protocols such as OAuth and HTTP. There have been similar efforts by other researchers in building reusable models of the web for security analysis [1,5,14]. As far as we know, however, none of these models has been used for synthesis.

#### **7 Conclusions**

In this paper, we have proposed a novel system design methodology centered around the notion of *mappings*. We have presented novel *mapping synthesis problems* and an algorithm for efficiently synthesizing symbolic maximal valid mappings. In addition, we have validated our approach on realistic case studies involving the OAuth protocols.

Future directions include performance improvements (e.g., exploiting the fact that our generalization-based algorithm is easily parallelizable), combining our generalization-based synthesis method with a counterexample-guided approach, and applying our synthesis approach to domains besides security (e.g., platform-based design and embedded systems [35]).

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Synthesis

### **Synthesizing Approximate Implementations for Unrealizable Specifications**

Rayna Dimitrova<sup>1</sup>, Bernd Finkbeiner<sup>2</sup>, and Hazem Torfah<sup>2</sup>(B)

<sup>1</sup> University of Leicester, Leicester, UK <sup>2</sup> Saarland University, Saarbrücken, Germany torfah@react.uni-saarland.de

**Abstract.** The unrealizability of a specification is often due to the assumption that the behavior of the environment is unrestricted. In this paper, we present algorithms for synthesis in bounded environments, where the environment can only generate input sequences that are ultimately periodic words (lassos) with finite representations of bounded size. We provide automata-theoretic and symbolic approaches for solving this synthesis problem, and also study the synthesis of approximate implementations from unrealizable specifications. Such implementations may violate the specification in general, but are guaranteed to satisfy the specification on at least a specified portion of the bounded-size lassos. We evaluate the algorithms on different arbiter specifications.

#### **1 Introduction**

The objective of reactive synthesis is to automatically construct an implementation of a reactive system from a high-level specification of its desired behaviour. While this idea holds great promise, applying synthesis in practice often faces significant challenges. One of the main hurdles is that the system designer has to provide the right formal specification, which is often a difficult task [12]. In particular, since the system being synthesized is required to satisfy its requirements against all possible environments allowed by the specification, accurately capturing the designer's knowledge about the environment in which the system will execute is crucial for successfully synthesizing an implementation.

Traditionally, environment assumptions are included in the specification, usually given as a temporal logic formula. There are, however, less explored ways of incorporating information about the environment, one of which is to consider a *bound on the size of the environment*, that is, a bound on the size of the state space of a transition system that describes the possible environment behaviours. Restricting the space of possible environments can turn an unrealizable specification into a realizable one. Temporal synthesis under such bounded environments was first studied in [6], where the authors extensively study several versions of the problem from the complexity-theoretic point of view.

This work was partially supported by the German Research Foundation (DFG) as part of the Collaborative Research Center "Foundations of Perspicuous Software Systems" (TRR 248, 389792660), and by the European Research Council (ERC) Grant OSARES (No. 683300).

In this paper, we follow a similar avenue of providing environment assumptions. However, instead of bounding the size of the state space of the environment, we associate a bound with the sequences of values of input signals produced by the environment. The infinite input sequences produced by a finite-state environment interacting with a finite-state system are ultimately periodic. Thus, each such infinite sequence σ ∈ Σ<sub>I</sub><sup>ω</sup> over the input alphabet Σ<sub>I</sub> can be represented as a *lasso*, which is a pair (u, v) of finite words u ∈ Σ<sub>I</sub><sup>∗</sup> and v ∈ Σ<sub>I</sub><sup>+</sup> such that σ = u · v<sup>ω</sup>. It is the length of such lassos that we bound. More precisely, given a bound k ∈ ℕ, we consider the language of all infinite sequences of inputs that can be represented by a lasso (u, v) with |u · v| = k. The goal of the *synthesis of lasso-precise implementations* is then to synthesize a system for which each execution resulting from a sequence of environment inputs in that language satisfies a given linear temporal specification.
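As a concrete illustration of this input model, the following Python sketch (helper names are ours, not from the paper) expands a lasso (u, v) into a prefix of the induced word u · v<sup>ω</sup> and enumerates all lassos of a given length over an alphabet:

```python
from itertools import product

def unroll(u, v, n):
    """First n letters of the ultimately periodic word u . v^omega."""
    assert len(v) >= 1, "the period v must be non-empty"
    word = list(u)
    while len(word) < n:
        word.extend(v)
    return word[:n]

def lassos(alphabet, k):
    """All lassos (u, v) with |u . v| = k and v non-empty."""
    for base in product(alphabet, repeat=k):
        for split in range(k):          # |u| = split, |v| = k - split >= 1
            yield base[:split], base[split:]
```

Over a one-bit input alphabet there are 2<sup>k</sup> · k such lassos of length k, although distinct lassos may induce the same infinite word.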

As an example, consider an arbiter serving two client processes. Each client issues a request when it wants to access a shared resource, and keeps the request signal up until it is done using the resource. The goal of the arbiter is to ensure the classical mutual exclusion property, by not granting access to the two clients simultaneously. The arbiter also has to ensure that each client request is eventually granted. This, however, is difficult since, first, a client might gain access to the resource and never lower the request signal, and second, the arbiter is not allowed to take away a grant unless the request has been set to false, or the client never sets the request to false in the future (the client has become unresponsive). The last two requirements together make the specification unrealizable, as the arbiter has no way of determining whether a client has become unresponsive or will lower the request signal in the future. If, however, the length of the lassos of the input sequences is bounded, then, after a sufficient number of steps, the arbiter can assume that if the request has not been set to false, then it will not be lowered in the future either, as the sequence of inputs must already have run at least once through its period that will be ultimately repeated from that point on.

Formally, we can express the requirements on the arbiter in Linear Temporal Logic (LTL) as follows. There is one input variable *r*<sub>i</sub> (for *request*) and one output variable *g*<sub>i</sub> (for *grant*) associated with each client. The specification is then given as the conjunction ϕ = ϕ<sub>mutex</sub> ∧ ϕ<sub>resp</sub> ∧ ϕ<sub>rel</sub>, where we use the LTL operators Next (◯), Globally (□), and Eventually (◇) to define the requirements

$$\begin{array}{l}\varphi_{mutex} = \Box\, \neg (g_1 \wedge g_2),\\ \varphi_{resp} = \Box \bigwedge_{i=1}^{2} (r_i \rightarrow \Diamond g_i),\\ \varphi_{rel} = \Box \bigwedge_{i=1}^{2} (g_i \wedge r_i \wedge \Diamond \neg r_i \rightarrow \bigcirc g_i).\end{array}$$

Due to the requirement not to revoke grants stated in ϕ<sub>rel</sub>, the specification ϕ is unrealizable (that is, there exists no implementation for the arbiter process). For any bound k on the length of the input lassos, however, ϕ is realizable. More precisely, there exists an implementation in which, once client i has not lowered the request signal for k consecutive steps, the variable g<sub>i</sub> is set to false.
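Such an implementation can be sketched directly. The following is a hand-written illustration of the idea, not the output of the synthesis procedure; the class and its field names are our own:

```python
class BoundedArbiter:
    """Two-client arbiter assuming input lassos of length at most k:
    a grant is revoked once the request has stayed high for k consecutive
    steps, since the client is then deemed unresponsive."""

    def __init__(self, k):
        self.k = k
        self.holder = None   # client currently holding the grant, or None
        self.held = 0        # consecutive steps the holder kept requesting
        self.turn = 0        # round-robin priority for the next grant

    def step(self, r1, r2):
        r = (r1, r2)
        if self.holder is not None:
            if not r[self.holder]:
                self.holder = None       # request lowered: revocation allowed
            else:
                self.held += 1
                if self.held >= self.k:
                    self.holder = None   # deemed unresponsive after k steps
        if self.holder is None:
            for i in (self.turn, 1 - self.turn):
                if r[i]:
                    self.holder, self.held = i, 0
                    self.turn = 1 - i    # alternate priority for fairness
                    break
        return (self.holder == 0, self.holder == 1)
```

Mutual exclusion holds by construction (there is at most one holder), and when both clients request forever, the k-step cutoff lets the grant rotate between them.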

This example shows that when the system designer has knowledge about the resources available to the environment processes, taking this knowledge into account can enable us to synthesize a system that is correct under this assumption.

In this paper we formally define the synthesis problem for *lasso-precise implementations*, that is, implementations that are correct for input lassos of bounded size, and describe an automata-theoretic approach to this synthesis problem. We also consider the synthesis of *lasso-precise implementations of bounded size*, and provide a symbolic synthesis algorithm based on quantified Boolean satisfiability.

Bounding the size of the input lassos can render some unrealizable specifications realizable but, similarly to bounding the size of the environment, comes at the price of higher computational complexity. To alleviate this problem, we further study the synthesis of *approximate implementations*, where we relax the synthesis problem and only require that, for a given ε > 0, the ratio of input lassos of a given size for which the specification is satisfied, to the total number of input lassos of that size, is at least 1 − ε. We then propose an *approximate synthesis method* based on maximum model counting for Boolean formulas [5]. The benefits of the approximate approach are two-fold. Firstly, it can often deliver high-quality approximate solutions more efficiently than the lasso-precise synthesis method; secondly, even when the specification is still unrealizable for a given lasso bound, we might be able to synthesize an implementation that is correct for a given fraction of the possible input lassos.
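The 1 − ε criterion can be illustrated by brute force for a single input bit. The function, the example specification, and the horizon-based check below are our own simplification: LTL satisfaction is approximated by evaluating a predicate on a finite unrolling of each lasso.

```python
from itertools import product

def satisfied_fraction(step_fn, spec, k, horizon):
    """Fraction of one-bit input lassos (u, v) with |u . v| = k on which
    the implementation's finite-horizon output trace satisfies `spec`."""
    total = sat = 0
    for base in product((False, True), repeat=k):
        for split in range(k):              # |u| = split, |v| = k - split
            u, v = base[:split], base[split:]
            inputs = list(u)
            while len(inputs) < horizon:    # unroll the lasso
                inputs.extend(v)
            outputs = step_fn(inputs[:horizon])
            total += 1
            sat += bool(spec(outputs))
    return sat / total
```

For instance, an identity system checked against the toy requirement "some output is ever true" fails exactly on the two lassos built from the all-false base, giving a fraction of 6/8 for k = 2.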

The rest of the paper is organized as follows. In Sect. 2 we discuss related work on environment assumptions in synthesis. In Sect. 3 we provide preliminaries on linear temporal properties and omega-automata. In Sect. 4 we define the synthesis problem for lasso-precise implementations and describe an automata-theoretic synthesis algorithm. In Sect. 5 we study the synthesis of lasso-precise implementations of bounded size and provide a reduction to quantified Boolean satisfiability. In Sect. 6 we define the approximate version of the problem and give a synthesis procedure based on maximum model counting. Finally, in Sect. 7 we present experimental results, and we conclude in Sect. 8.

#### **2 Related Work**

Providing good-quality environment specifications (typically in the form of assumptions on the allowed behaviours of the environment) is crucial for the synthesis of implementations from high-level specifications. Formal specifications, and thus also environment assumptions, are often hard to get right, and have been identified as one of the bottlenecks in formal methods and autonomy [12]. It is therefore not surprising that there is a plethora of approaches addressing the problem of how to revise inadequate environment assumptions when these are the cause of unrealizability of the system requirements.

Most approaches in this direction build upon the idea of analyzing the cause of unrealizability of the specification and extracting assumptions that help eliminate this cause. The method proposed in [2] uses the game graph on which the realizability question is answered in order to construct a Büchi automaton representing a minimal assumption that makes the specification realizable. The authors of [8] provide an alternative approach in which the environment assumptions are gradually strengthened based on counterstrategies for the environment. The key ingredient of this approach is the use of a library of specification templates and user scenarios for the mining of assumptions, in order to generate good-quality assumptions. A similar approach is used in [1], where, however, assumption patterns are synthesized directly from the counterstrategy without the need for the user to provide patterns. A different line of work focuses on giving feedback to the user or specification designer about the reason for unrealizability, so that they can, if possible, revise the specification accordingly. The key challenge addressed there lies in providing easy-to-understand feedback to users, which relies on finding a minimal cause for why the requirements are not achievable and generating a natural language explanation of this cause [11].

In the above-mentioned approaches, assumptions are provided or constructed in the form of a temporal logic formula or an omega-automaton. Thus, on the one hand it is often difficult for specification designers to specify the right assumptions, and on the other hand special care has to be taken by the assumption generation procedures to ensure that the constructed assumptions are simple enough for the user to understand and evaluate. The work [6] takes a different route, making assumptions about the *size* of the environment, that is, including as an additional parameter to the synthesis problem a bound on the state space of the environment. Similarly to temporal logic assumptions, this relaxation of the synthesis problem can turn unrealizable specifications into realizable ones. From the system designer's point of view, however, it might be significantly easier to estimate the size of environments that are feasible in practice than to express the implications of this additional information in a temporal logic formula. In this paper we take a route similar to [6], and consider a bound on the cyclic structures in the environment's behaviour. Thus, the closest work to ours is the temporal synthesis for bounded environments studied in [6]. In fact, we show that the synthesis problem for lasso-precise implementations and the synthesis problem under bounded environments can be reduced to each other. However, while the focus of [6] is on the computational complexity of the bounded synthesis problems, here we provide both automata-theoretic and symbolic approaches for solving the synthesis problem for environments with bounded lassos. We further consider an *approximate version of this synthesis problem*. The benefits of using approximation are two-fold. Firstly, as shown in [6], while bounding the environment can make some specifications realizable, this comes at a high computational complexity price. In this case, approximation might be able to provide solutions of sufficient quality more efficiently. Furthermore, even after bounding the environment's input behaviours, the specification might still remain unrealizable, in which case we would like to satisfy the requirements for as many input lassos as possible. In that sense, we get closer to synthesis methods for probabilistic temporal properties in probabilistic environments [7]. However, we consider non-probabilistic environments (i.e., all possible inputs are equally likely), and provide probabilistic guarantees with desired confidence by employing maximum model counting techniques. Maximum model counting has previously been used for the synthesis of approximate non-reactive programs [5]. Here, on the other hand, we are concerned with the synthesis of reactive systems from temporal specifications.

Bounding the size of the synthesized system implementation is a complementary restriction of the synthesis problem, which has attracted a lot of attention in recent years [4]. The computational complexity of the synthesis problem when both the system's and the environment's size is bounded has been studied in [6]. In this paper we provide a symbolic synthesis procedure for bounded synthesis of lasso-precise implementations based on quantified Boolean satisfiability.

#### **3 Preliminaries**

We now recall definitions and notation from formal languages and automata, and notions from reactive synthesis such as implementation and environment.

*Linear-Time Properties and Lassos.* A *linear-time property* ϕ over an alphabet Σ is a set of infinite words ϕ ⊆ Σ<sup>ω</sup>. Elements of ϕ are called *models* of ϕ. A *lasso* of length k over an alphabet Σ is a pair (u, v) of finite words u ∈ Σ<sup>∗</sup> and v ∈ Σ<sup>+</sup> with |u · v| = k that induces the ultimately periodic word u · v<sup>ω</sup>. We call u · v the *base* of the lasso or ultimately periodic word, and k the *length* of the lasso.

If a word w ∈ Σ<sup>∗</sup> is a prefix of a word σ ∈ Σ<sup>∗</sup> ∪ Σ<sup>ω</sup>, we write w < σ. For a language L ⊆ Σ<sup>∗</sup> ∪ Σ<sup>ω</sup>, we define *Prefix*(L) = {w ∈ Σ<sup>∗</sup> | ∃σ ∈ L : w < σ}, the set of all finite words that are prefixes of words in L.

*Implementations.* We represent implementations as *labeled transition systems*. Let I and O be finite sets of *input* and *output atomic propositions*, respectively. A 2<sup>O</sup>-labeled 2<sup>I</sup>-transition system is a tuple T = (T, t<sub>0</sub>, τ, o), consisting of a finite set of states T, an initial state t<sub>0</sub> ∈ T, a transition function τ : T × 2<sup>I</sup> → T, and a labeling function o : T → 2<sup>O</sup>. We denote by |T| the size of an implementation T, defined as its number of states. A *path* in T is a sequence π : ℕ → T × 2<sup>I</sup> of states and inputs that follows the transition function, i.e., for all i ∈ ℕ, if π(i) = (t<sub>i</sub>, e<sub>i</sub>) and π(i + 1) = (t<sub>i+1</sub>, e<sub>i+1</sub>), then t<sub>i+1</sub> = τ(t<sub>i</sub>, e<sub>i</sub>). We call a path *initial* if it starts with the initial state: π(0) = (t<sub>0</sub>, e) for some e ∈ 2<sup>I</sup>. For an initial path π, we call the sequence σ<sub>π</sub> : i ↦ (o(t<sub>i</sub>) ∪ e<sub>i</sub>) ∈ (2<sup>I∪O</sup>)<sup>ω</sup> the *trace* of π. We call the set of traces of a transition system T the *language of* T, denoted L(T).

*Finite-state environments* can be represented as labeled transition systems in a similar way, with the difference that the inputs are the outputs of the implementation, and the states of the environment are labeled with inputs for the implementation. More precisely, a finite-state environment is a 2<sup>I</sup>-labeled 2<sup>O</sup>-transition system E = (E, s<sub>0</sub>, ρ, ι). The composition of an implementation T and an environment E results in a set of traces of T, which we denote L<sub>E</sub>(T), where σ = σ<sub>0</sub>σ<sub>1</sub> ... ∈ L<sub>E</sub>(T) if and only if σ ∈ L(T) and there exists an initial path s<sub>0</sub>s<sub>1</sub> ... in E such that for all i ∈ ℕ, s<sub>i+1</sub> = ρ(s<sub>i</sub>, σ<sub>i+1</sub> ∩ O) and σ<sub>i</sub> ∩ I = ι(s<sub>i</sub>).
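Under one reasonable timing convention (the input and output of step i are emitted together, then both machines advance; the encoding below is our own, not the paper's), the composed trace of deterministic machines can be computed as:

```python
def trace_under_env(tau, o, t0, rho, iota, s0, n):
    """First n letters of the trace of a deterministic implementation
    (tau, o, t0) composed with a deterministic environment (rho, iota, s0).
    Letters are sets of atomic propositions."""
    t, s, trace = t0, s0, []
    for _ in range(n):
        inp = iota(s)            # environment fixes the inputs at state s
        out = o(t)               # implementation labels its state with outputs
        trace.append(out | inp)
        t = tau(t, inp)          # implementation reads the inputs
        s = rho(s, out)          # environment reads the outputs
    return trace
```

For example, a one-state environment that always requests, composed with an implementation that toggles its grant, produces the trace {r}, {r, g}, {r}, {r, g}, ...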

*Linear-Time Temporal Logic.* We specify properties of reactive systems (implementations) as formulas in Linear-time Temporal Logic (LTL) [9]. We consider the usual temporal operators Next ◯ and Until U, as well as the derived operators Release R (the dual of U), Eventually ◇, and Globally □. LTL formulas are defined over a set of atomic propositions *AP*. We denote the satisfaction of an LTL formula ϕ by an infinite sequence σ ∈ (2<sup>AP</sup>)<sup>ω</sup> of valuations of the atomic propositions by σ |= ϕ, and call σ a *model* of ϕ. For an LTL formula ϕ, we define the language L(ϕ) of ϕ to be the set {σ ∈ (2<sup>AP</sup>)<sup>ω</sup> | σ |= ϕ}.

For a set of atomic propositions *AP* = O ∪ I, we say that a 2<sup>O</sup>-labeled 2<sup>I</sup>-transition system T satisfies an LTL formula ϕ if and only if L(T) ⊆ L(ϕ), i.e., every trace of T satisfies ϕ. In this case we call T a *model* of ϕ, denoted T |= ϕ. If T satisfies ϕ for an environment E, i.e., L<sub>E</sub>(T) ⊆ L(ϕ), we write T |=<sub>E</sub> ϕ.

For I ⊆ *AP* and σ ∈ (2<sup>AP</sup>)<sup>∗</sup> ∪ (2<sup>AP</sup>)<sup>ω</sup>, we denote by σ|<sub>I</sub> the projection of σ on I, obtained as the sequence of valuations of the propositions from I in σ.

*Automata Over Infinite Words.* The automata-theoretic approach to reactive synthesis relies on the fact that an LTL specification can be translated to an automaton over infinite words, or, alternatively, that the specification can be provided directly as such an automaton. An *alternating parity automaton* over an alphabet Σ is a tuple A = (Q, q<sub>0</sub>, δ, μ), where Q denotes a finite set of states, q<sub>0</sub> ∈ Q denotes an initial state, δ denotes a transition function, and μ : Q → C ⊂ ℕ is a coloring function. The transition function δ : Q × Σ → 𝔹<sup>+</sup>(Q) maps a state and an input letter to a positive Boolean combination of states [14].

A tree T over a set of directions D is a prefix-closed subset of D<sup>∗</sup>. The empty sequence ε is called the root. The children of a node n ∈ T are the nodes {n · d ∈ T | d ∈ D}. A Σ-labeled tree is a pair (T, l), where l : T → Σ is the labeling function. A *run* of A = (Q, q<sub>0</sub>, δ, μ) on an infinite word σ = α<sub>0</sub>α<sub>1</sub> ··· ∈ Σ<sup>ω</sup> is a Q-labeled tree (T, l) that satisfies the following constraints: (1) l(ε) = q<sub>0</sub>, and (2) for all n ∈ T, if l(n) = q, then {l(n′) | n′ is a child of n} satisfies δ(q, α<sub>|n|</sub>).

A run tree is *accepting* if every branch either hits a *true* transition or is an infinite branch n<sub>0</sub>n<sub>1</sub>n<sub>2</sub> ··· ∈ T whose label sequence l(n<sub>0</sub>)l(n<sub>1</sub>)l(n<sub>2</sub>) ... satisfies the *parity condition*, which requires that the highest color occurring infinitely often in the sequence μ(l(n<sub>0</sub>))μ(l(n<sub>1</sub>))μ(l(n<sub>2</sub>)) ··· ∈ ℕ<sup>ω</sup> is even. An infinite word σ is accepted by an automaton A if there exists an accepting run of A on σ. The set of infinite words accepted by A is called its *language*, denoted L(A).

A *nondeterministic* automaton is a special alternating automaton, where for all states q and input letters α, δ(q, α) is a disjunction. An alternating automaton is called *universal* if, for all states q and input letters α, δ(q, α) is a conjunction. A universal and nondeterministic automaton is called *deterministic*.

A parity automaton is called a *Büchi* automaton if and only if the image of μ is contained in {1, 2}, and a *co-Büchi* automaton if and only if the image of μ is contained in {0, 1}. Büchi and co-Büchi automata are denoted by (Q, Q<sub>0</sub>, δ, F), where F ⊆ Q denotes the states with the higher color. A run graph of a Büchi automaton is thus accepting if, on every infinite path, there are infinitely many visits to states in F; a run graph of a co-Büchi automaton is accepting if, on every path, there are only finitely many visits to states in F.
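For deterministic Büchi automata and ultimately periodic inputs, acceptance can be decided by simply running the automaton around the lasso until the configuration repeats. This is a sketch with our own encoding (delta as a function, accepting as a set of states):

```python
def dbw_accepts_lasso(delta, q0, accepting, u, v):
    """Check whether a deterministic Buchi automaton accepts u . v^omega:
    run through u, then iterate v until the pair (state, position in v)
    repeats; accept iff an accepting state is visited on that cycle."""
    q = q0
    for a in u:
        q = delta(q, a)
    seen, visited = {}, []
    i = 0
    while (q, i % len(v)) not in seen:
        seen[(q, i % len(v))] = i
        q = delta(q, v[i % len(v)])
        visited.append(q)
        i += 1
    start = seen[(q, i % len(v))]
    return any(s in accepting for s in visited[start:])
```

For example, the two-state automaton accepting words with infinitely many a's accepts (ab)<sup>ω</sup> but rejects a · b<sup>ω</sup>.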

The next theorem states the relation between LTL and alternating Büchi automata, namely that every LTL formula ϕ can be translated to an alternating Büchi automaton with the same language and with size linear in the length of ϕ.

**Theorem 1.** [13] *For every LTL formula* ϕ *there is an alternating Büchi automaton* A *of size* O(|ϕ|) *with* L(A) = L(ϕ)*, where* |ϕ| *is the length of* ϕ*.*

*Automata Over Finite Words.* We also use automata over finite words as acceptors for languages consisting of prefixes of traces. A nondeterministic finite automaton over an alphabet Σ is a tuple A = (Q, Q<sub>0</sub>, δ, F), where Q and Q<sub>0</sub> ⊆ Q are again the states and initial states respectively, δ : Q × Σ → 2<sup>Q</sup> is the transition function, and F is the set of accepting states. A run on a word a<sub>1</sub> ... a<sub>n</sub> is a sequence of states q<sub>0</sub>q<sub>1</sub> ... q<sub>n</sub>, where q<sub>0</sub> ∈ Q<sub>0</sub> and q<sub>i</sub> ∈ δ(q<sub>i−1</sub>, a<sub>i</sub>) for all 1 ≤ i ≤ n. The run is accepting if q<sub>n</sub> ∈ F. Deterministic finite automata are defined similarly, with the difference that there is a single initial state q<sub>0</sub>, and that the transition function is of the form δ : Q × Σ → Q. As usual, we denote the set of words accepted by a nondeterministic or deterministic finite automaton A by L(A).
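Nondeterministic acceptance on a finite word can be decided by tracking the set of states reachable from the initial set (subset simulation); a minimal sketch with our own encoding:

```python
def nfa_accepts(delta, initial, accepting, word):
    """Run an NFA on a finite word by tracking all reachable states;
    accept iff some run ends in an accepting state."""
    current = set(initial)
    for a in word:
        current = {q2 for q in current for q2 in delta(q, a)}
    return bool(current & accepting)
```

As a usage example, a three-state NFA accepting exactly the words ending in "ab" accepts "aab" but rejects "aba".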

#### **4 Synthesis of Lasso-Precise Implementations**

In this section we first define the synthesis problem for environments producing input sequences representable as lassos of length bounded by a given number. We then provide an automata-theoretic algorithm for this synthesis problem.

#### **4.1 Lasso-Precise Implementations**

We begin by formally defining the language of sequences of input values representable by lassos of a given length k. For the rest of the section, we consider linear-time properties defined over a set of atomic propositions *AP*. The subset I ⊆ *AP* consists of the input atomic propositions controlled by the environment.

**Definition 1 (Bounded Model Languages).** *Let* ϕ *be a linear-time property over a set of atomic propositions AP, let* Σ = 2<sup>AP</sup>*, and let* I ⊆ *AP.*

*We say that an infinite word* σ ∈ Σ<sup>ω</sup> *is an* I*-*k*-model of* ϕ*, for a bound* k ∈ ℕ*, if and only if there are words* u ∈ (2<sup>I</sup>)<sup>∗</sup> *and* v ∈ (2<sup>I</sup>)<sup>+</sup> *such that* |u · v| = k *and* σ|<sub>I</sub> = u · v<sup>ω</sup>*. The language of* I*-*k*-models of the property* ϕ *is defined as the set* L<sup>I</sup><sub>k</sub>(ϕ) = {σ ∈ Σ<sup>ω</sup> | σ *is an* I*-*k*-model of* ϕ}*.*

Note that a model of ϕ might be induced by lassos of different lengths, and by more than one lasso of the same length; e.g., a<sup>ω</sup> is induced by (a, a) and (ε, aa). The next lemma establishes that if a model of ϕ can be represented by a lasso of length k, then it can also be represented by a lasso of any larger length.

**Lemma 1.** *For a linear-time property* ϕ *over* Σ = 2<sup>AP</sup>*, subset* I ⊆ *AP of atomic propositions, and bound* k ∈ ℕ*, we have* L<sup>I</sup><sub>k</sub>(ϕ) ⊆ L<sup>I</sup><sub>k′</sub>(ϕ) *for all* k′ > k*.*

*Proof.* Let σ ∈ L<sup>I</sup><sub>k</sub>(ϕ). Then σ |= ϕ and there exists (u, v) ∈ (2<sup>I</sup>)<sup>∗</sup> × (2<sup>I</sup>)<sup>+</sup> such that |u · v| = k and σ|<sub>I</sub> = u · v<sup>ω</sup>. Let v = v<sub>1</sub> ... v<sub>n</sub>. Since u · v<sub>1</sub> · (v<sub>2</sub> ... v<sub>n</sub>v<sub>1</sub>)<sup>ω</sup> = u · (v<sub>1</sub> ... v<sub>n</sub>)<sup>ω</sup> = σ|<sub>I</sub>, the lasso (u · v<sub>1</sub>, v<sub>2</sub> ... v<sub>n</sub>v<sub>1</sub>) has length k + 1 and induces the same word, so σ ∈ L<sup>I</sup><sub>k+1</sub>(ϕ). The claim follows by induction. ∎
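The rotation step used in the proof is easy to check mechanically (helper names are ours):

```python
def widen_lasso(u, v):
    """Lemma 1's rotation: turn a lasso (u, v) of length k into a lasso of
    length k + 1 inducing the same word, by moving the first period letter
    into the spoke and appending it to the end of the period."""
    return u + v[:1], v[1:] + v[:1]

def unroll(u, v, n):
    """First n letters of u . v^omega."""
    word = list(u)
    while len(word) < n:
        word.extend(v)
    return word[:n]
```

For example, widening ("a", "bc") yields ("ab", "cb"), and both lassos induce the word abcbcbc...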

Using the definition of I-k-models, the language of infinite sequences of environment inputs representable by lassos of length k can be expressed as L<sup>I</sup><sub>k</sub>(Σ<sup>ω</sup>).

**Definition 2 (**k**-lasso-precise Implementations).** *For a linear-time property* ϕ *over* Σ = 2<sup>AP</sup>*, subset* I ⊆ *AP of atomic propositions, and bound* k ∈ ℕ*, we say that a transition system* T *is a* k*-lasso-precise implementation of* ϕ*, denoted* T |=<sub>k,I</sub> ϕ*, if it holds that* L<sup>I</sup><sub>k</sub>(L(T)) ⊆ ϕ*.*

That is, in a k-lasso-precise implementation T, all traces of T that belong to the language L<sup>I</sup><sub>k</sub>(Σ<sup>ω</sup>) are I-k-models of the specification ϕ.

#### **Problem definition: Synthesis of Lasso-Precise Implementations**

Given a linear-time property ϕ over atomic propositions *AP* with input atomic propositions I, and given a bound k ∈ ℕ, construct an implementation T such that T |=<sub>k,I</sub> ϕ, or determine that such an implementation does not exist.

Another way to bound the behaviour of the environment is to consider a bound on the size of its state space. The *synthesis problem for bounded environments* asks, for a given linear temporal property ϕ and a bound k ∈ ℕ, to synthesize a transition system T such that for every possible environment E of size at most k, the transition system T satisfies ϕ under environment E, i.e., T |=<sub>E</sub> ϕ.

We now establish the relationship between the synthesis of lasso-precise implementations and synthesis under bounded environments. Intuitively, the two synthesis problems can be reduced to each other since an environment of a given size, interacting with a given implementation, can only produce ultimately periodic sequences of inputs representable by lassos of length determined by the sizes of the environment and the implementation. This intuition is formalized in the following proposition, stating the connection between the two problems.

**Proposition 1.** *Given a specification* ϕ *over a set of atomic propositions AP with subset* I ⊆ *AP of atomic propositions controlled by the environment, and a bound* k ∈ ℕ*, for every transition system* T *the following statements hold:*

*(1) If* T |=<sub>E</sub> ϕ *for all environments* E *of size at most* k*, then* T |=<sub>k,I</sub> ϕ*.*
*(2) If* T |=<sub>k·|T|,I</sub> ϕ*, then* T |=<sub>E</sub> ϕ *for all environments* E *of size at most* k*.*

*Proof.* For *(1)*, let T be a transition system such that T |=<sub>E</sub> ϕ for all environments E of size at most k. Assume, for the sake of contradiction, that T ̸|=<sub>k,I</sub> ϕ, i.e., that there exists a word σ ∈ L(T) such that σ ∈ L<sup>I</sup><sub>k</sub>(Σ<sup>ω</sup>) and σ ̸|= ϕ.

Since σ ∈ L<sup>I</sup><sub>k</sub>(Σ<sup>ω</sup>), we can construct an environment E of size at most k that produces the sequence of inputs σ|<sub>I</sub>. Since E is of size at most k, we have that T |=<sub>E</sub> ϕ. Thus, since σ ∈ L<sub>E</sub>(T), we have σ |= ϕ, which is a contradiction.

For *(2)*, let T be a transition system such that T ⊨_{k·|T|,I} ϕ. Assume, for the sake of contradiction, that there exists an environment E of size at most k such that T ⊭_E ϕ. Since T ⊭_E ϕ, there exists σ ∈ L_E(T) such that σ ⊭ ϕ. As the number of states of E is at most k, the input sequences it generates in interaction with T can be represented as lassos of size at most k·|T|. Thus, σ ∈ L^I_{k·|T|}(Σ^ω). This contradicts the choice of T, according to which T ⊨_{k·|T|,I} ϕ. ∎
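The pigeonhole argument behind Proposition 1 can be made concrete: when a k-state environment runs in closed loop with an n-state system, the joint state must repeat within k·n steps, so the produced input sequence forms a lasso of length at most k·n. The following sketch illustrates this on a hypothetical 2-state environment and 3-state system (all machine definitions are invented for illustration):

```python
# Hypothetical toy machines, invented for illustration: a 2-state
# environment driving a 3-state system.
env_next = {0: 1, 1: 0}        # environment transitions (input-independent here)
env_out = {0: "a", 1: "b"}     # input letter the environment emits per state
sys_next = {(0, "a"): 1, (0, "b"): 2,
            (1, "a"): 2, (1, "b"): 0,
            (2, "a"): 0, (2, "b"): 1}

def first_repeat(e0=0, t0=0):
    """Run the closed loop and return (stem length, step of first revisit)
    of the joint (environment, system) state."""
    seen = {}
    e, t, step = e0, t0, 0
    while (e, t) not in seen:
        seen[(e, t)] = step
        e, t = env_next[e], sys_next[(t, env_out[e])]
        step += 1
    return seen[(e, t)], step

stem, revisit = first_repeat()
# The joint state space has k * |T| = 2 * 3 = 6 states, so the run is a
# lasso of length at most 6.
assert revisit <= 6
```

The bound holds for any choice of transition tables, since the joint run visits at most k·|T| distinct states before revisiting one.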

#### **4.2 Automata-Theoretic Synthesis of Lasso-Precise Implementations**

We now provide an automata-theoretic algorithm for the synthesis of lasso-precise implementations. The underlying idea of this approach is to first construct an automaton over finite traces that accepts all finite prefixes of traces in L^I_k(Σ^ω). Then, combining this automaton with an automaton representing the property ϕ, we can construct an automaton whose language is non-empty if and only if there exists a k-lasso-precise implementation of ϕ.

The next theorem presents the construction of a deterministic finite automaton for the language *Prefix*(L^I_k(Σ^ω)).

**Theorem 2.** *For any set AP of atomic propositions, subset* I ⊆ AP*, and bound* k ∈ ℕ*, there is a deterministic finite automaton* A_k *over the alphabet* Σ = 2^AP*, with size* (2^{|I|} + 1)^k · (k + 1)^k*, such that* L(A_k) = {w ∈ Σ* | ∃σ ∈ L^I_k(Σ^ω). w < σ}*.*

*Idea & Construction.* For given k ∈ ℕ we first define an automaton Â_k = (Q, q₀, δ, F) over Σ̂ = 2^I, such that L(Â_k) = {w ∈ Σ̂* | ∃σ ∈ L^I_k(Σ̂^ω). w < σ}. That is, L(Â_k) is the set of all finite prefixes of infinite words over Σ̂ that can be represented by a lasso of length k. We can then define the automaton A_k as the automaton that for each w ∈ Σ* simulates Â_k on the projection w|_I of w.

We define the automaton Â_k = (Q, q₀, δ, F) such that

$$\begin{array}{l} - \ Q = (\widehat{\Sigma} \cup \{\#\})^k \times \{-, 1, \dots, k\}^k, \\ - \ q\_0 = (\#^k, (1, 2, \dots, k)), \end{array}$$

$$- \ \delta(q,\alpha) = \begin{cases} (w \cdot \alpha \cdot \#^{m-1},\, t) & \text{if } q = (w \cdot \#^{m}, t) \text{ where } 1 \le m \le k, \\ & w \in \widehat{\Sigma}^{k-m},\ t \in \{-, 1, \dots, k\}^{k} \\ (w, (i\_1', \dots, i\_k')) & \text{if } q = (w, (i\_1, \dots, i\_k)) \text{ where } w \in \widehat{\Sigma}^{k}, \text{ and} \\ & i\_j' = \begin{cases} - & \text{if } i\_j = -, \text{ or } i\_j \le k \text{ and } w(i\_j) \ne \alpha \\ i\_j + 1 & \text{if } i\_j < k \text{ and } w(i\_j) = \alpha \\ j & \text{if } i\_j = k \text{ and } w(i\_j) = \alpha \end{cases} \end{cases}$$

$$- \ F = Q \setminus \{ (w, (-, \dots, -)) \mid w \in \widehat{\Sigma}^{k} \}.$$

*Proof.* States of the form (w · α · #^m, t) with m ≥ 1 store the portion of the input word read so far, for input words of length smaller than k. In states of this form we have t = (1, 2, …, k), which implies that all such states are accepting. In turn, this means that Â_k accepts all words of length at most k. This is justified by the fact that each word of length at most k is a prefix of an infinite word in L^I_k(Σ̂^ω), obtained by repeating the prefix infinitely often. Now, let us consider words of length greater than k.

In states of the form (u, (i₁, …, i_k)) with u ∈ Σ̂^k, the word u stores the first k letters of the input word. Intuitively, the tuple (i₁, …, i_k) stores the information about the loops that are still possible, given the portion of the input word read thus far. To see this, consider a word w ∈ Σ̂* such that |w| = l > k, and let q₀q₁…q_l be the run of Â_k on w. The state q_l is of the form q_l = (w(1)…w(k), (i^l_1, …, i^l_k)). It can be shown by induction on l that for each j we have i^l_j ≠ − if and only if w is of the form w = w′ · w″ · w‴, where w′ = w(1)…w(j−1), w″ = (w(j)…w(k))^κ for some κ ≥ 0, and w‴ = w(j)…w(i^l_j − 1). Thus, if i^l_j ≠ −, then it is possible to have a loop starting at position j, and i^l_j is such that w(j)…w(i^l_j − 1) is the prefix of w(j)…w(k) appearing after the (possibly empty) sequence of repetitions of w(j)…w(k). This means that if i^l_j ≠ −, then w is a prefix of the infinite word w′ · (w(j)…w(k))^ω ∈ L^I_k(Σ̂^ω). Therefore, if the run of Â_k on a word w with |w| > k is accepting, then there exists σ ∈ L^I_k(Σ̂^ω) such that w < σ.

For the other direction, suppose that for each j we have i^l_j = −. Take any j, and consider the first position m in the run q₀q₁…q_l where i^m_j = −. By the definition of δ we have that w(m) ≠ w(i^{m−1}_j). This means that the prefix w(1)…w(m) cannot be extended to the word w(1)…w(j−1)(w(j)…w(k))^ω. Since for every j ∈ {1,…,k} we can find such a position m, there does not exist σ ∈ L^I_k(Σ̂^ω) such that w < σ. This concludes the proof. ∎
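The language decided by Â_k has a simple semantic characterization that can serve as a reference implementation: a finite word w is a prefix of some word representable by a lasso of length k iff either |w| ≤ k, or for some loop start s the tail of w beyond position k agrees with the periodic continuation of its first k letters. A brute-force check (a sketch with invented example words, using 0-based indexing; the DFA of Theorem 2 decides the same language symbolically by tracking all loop starts at once via the tuple (i₁, …, i_k)):

```python
def is_k_lasso_prefix(w, k):
    """Check whether the finite word w (any sequence) is a prefix of some
    infinite word representable by a lasso of length k, i.e. of
    w(0)..w(k-1) continued periodically from some loop start s."""
    if len(w) <= k:
        # every short word extends to a lasso by looping the whole prefix
        return True
    return any(
        all(w[p] == w[s + (p - s) % (k - s)] for p in range(k, len(w)))
        for s in range(k)  # try every possible loop start s
    )

assert is_k_lasso_prefix("abab", 2)      # prefix of (ab)^omega
assert is_k_lasso_prefix("abbb", 2)      # prefix of a . b^omega
assert not is_k_lasso_prefix("abc", 2)   # would need a lasso of length 3
```

The exponential lower bound of Theorem 3 says that no NFA can decide this language substantially more succinctly than the explicit construction.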

The automaton constructed in the previous theorem has size exponential in the length of the lassos. In the next theorem we show that this exponential blow-up is unavoidable: every nondeterministic finite automaton for the language *Prefix*(L^I_k(Σ^ω)) has size at least 2^{Ω(k)}.

**Theorem 3.** *For any bound* k ∈ ℕ *and sets of atomic propositions AP and* ∅ ≠ I ⊆ AP*, every nondeterministic finite automaton* N *over the alphabet* Σ = 2^AP *that recognizes* L = {w ∈ Σ* | ∃σ ∈ L^I_k(Σ^ω). w < σ} *has size at least* 2^{Ω(k)}*.*

*Proof.* Let N = (Q, Q₀, δ, F) be a nondeterministic finite automaton for L. For each w ∈ Σ^k, we have that w · w ∈ L. Therefore, for each w ∈ Σ^k there exists at least one accepting run ρ = q₀q₁…q_f of N on w · w. We denote by q(ρ, m) the state q_m appearing at position m of a run ρ.

Let a ∈ 2^I be a letter in 2^I, and let Σ′ = Σ \ {a′ ∈ Σ | a′|_I = a}. Let L′ ⊆ L be the language L′ = {w ∈ Σ^k | ∃w′ ∈ (Σ′)^{k−1}, a′ ∈ Σ : w = w′ · a′ and a′|_I = a}. That is, L′ consists of the words of length k in which letters a′ with a′|_I = a appear in the last position and only in the last position.

Let us define the set of states

Q_k = {q(ρ, k) | ∃w ∈ L′ : ρ is an accepting run of N on w · w}.

That is, Q_k consists of the states that appear at position k on some accepting run on some word w · w, where w is from L′. We will show that |Q_k| ≥ 2^{k−1}.

Assume that this does not hold, i.e., |Q_k| < 2^{k−1}. Since |L′| ≥ 2^{k−1}, this implies that there exist w₁, w₂ ∈ L′ such that w₁|_I ≠ w₂|_I and there exist accepting runs ρ₁ and ρ₂ of N on w₁ · w₁ and w₂ · w₂, respectively, such that q(ρ₁, k) = q(ρ₂, k). That is, there must be two words in L′ with w₁|_I ≠ w₂|_I which have accepting runs on w₁ · w₁ and w₂ · w₂ visiting the same state at position k.

We now construct a run ρ₁,₂ on the word w₁ · w₂ that follows ρ₁ for the first k steps on w₁, ending in state q(ρ₁, k), and from there on follows ρ₂ on w₂. It is easy to see that ρ₁,₂ is a run on the word w₁ · w₂. The run is accepting, since ρ₂ is accepting. This means that w₁ · w₂ ∈ L, which, as we show next, leads to a contradiction.

To see this, recall that w₁ = w′₁ · a′ and w₂ = w′₂ · a″ with w₁|_I ≠ w₂|_I and a′|_I = a″|_I = a. Since w₁ · w₂ ∈ L, we have that w′₁ · a′ · w′₂ · a″ < σ for some σ ∈ L^I_k(Σ^ω). That is, there exists a lasso of length k for some word σ, and w′₁ · a′ · w′₂ · a″ is a prefix of this word. Since a does not appear in w′₂|_I, the loop of this lasso must be the whole word w₁|_I, which is not possible, since w₁|_I ≠ w₂|_I.

This is a contradiction, which shows that |Q| ≥ |Q_k| ≥ 2^{k−1}. Since N was an arbitrary nondeterministic finite automaton for L, this implies that the minimal automaton for L has at least 2^{Ω(k)} states, which concludes the proof. ∎

Using the automaton from Theorem 2, we can transform every property automaton A into an automaton that accepts a word representable by a lasso of length at most k if and only if it is in L(A), and that accepts every word not representable by a lasso of length at most k.

**Theorem 4.** *Let AP be a set of atomic propositions, and let* I ⊆ AP*. For every (deterministic, nondeterministic, or alternating) parity automaton* A *over* Σ = 2^AP *and every* k ∈ ℕ*, there is a (deterministic, nondeterministic, or alternating) parity automaton* A′ *of size* 2^{O(k)} · |A| *such that* L(A′) = (L^I_k(Σ^ω) ∩ L(A)) ∪ (Σ^ω \ L^I_k(Σ^ω))*.*

*Proof.* The theorem is a consequence of Theorem 2, established as follows. Let A = (Q, Q₀, δ, μ) be a parity automaton, and let D = (Q̂, q̂₀, δ̂, F̂) be the deterministic finite automaton for bound k defined as in Theorem 2. We define the parity automaton A′ = (Q′, Q′₀, δ′, μ′) with the following components:

– Q′ = Q × Q̂;
– Q′₀ = {(q₀, q̂₀) | q₀ ∈ Q₀} (when A is deterministic, Q′₀ is a singleton set);
– δ′((q, q̂), α) = δ(q, α)[q′/(q′, δ̂(q̂, α))], where δ(q, α)[q′/(q′, q̂′)] denotes the Boolean expression obtained from δ(q, α) by replacing every state q′ by the state (q′, q̂′);

– μ′((q, q̂)) = μ(q) if q̂ ∈ F̂, and μ′((q, q̂)) = 0 if q̂ ∉ F̂.

Intuitively, the automaton A′ is constructed as the product of A and D, where runs entering a state of D that is non-accepting in D are accepting in A′. To see this, recall from the construction in Theorem 2 that once D enters a state in Q̂ \ F̂, it remains in such a state forever. Thus, by setting the color of all states (q, q̂) with q̂ ∉ F̂ to 0, we ensure that words containing a prefix rejected by D have only runs in which the highest color appearing infinitely often is 0. Thus, all words that are not representable by lassos of length at most k are accepted by A′, while words representable by lassos of length at most k are accepted if and only if they are in L(A). ∎

The following theorem is a consequence of the one above, and provides us with an automata-theoretic approach to solving the lasso-precise synthesis problem.

**Theorem 5 (Synthesis).** *Let AP be a set of atomic propositions, and let* I ⊆ AP *be the subset of atomic propositions controlled by the environment. For a specification given as a deterministic parity automaton* P *over the alphabet* Σ = 2^AP *and a bound* k ∈ ℕ*, finding an implementation* T *such that* T ⊨_{k,I} P *can be done in time polynomial in the size of the automaton* P *and exponential in the bound* k*.*

#### **5 Bounded Synthesis of Lasso-Precise Implementations**

For a specification ϕ given as an LTL formula, a bound n on the size of the synthesized implementation, and a bound k on the lassos of input sequences, *bounded synthesis of lasso-precise implementations* searches for an implementation T of size n such that T ⊨_{k,I} ϕ. Using the automata constructions from the previous section, we can construct a universal co-Büchi automaton for the language L^I_k(ϕ) ∪ (Σ^ω \ L^I_k(Σ^ω)) and construct the constraint system as presented in [4]. This constraint system is exponential in both |ϕ| and k. In the following we show how the problem can be encoded as a quantified Boolean formula of size polynomial in |ϕ| and k.

**Theorem 6.** *For a specification given as an LTL formula* ϕ *and bounds* k ∈ ℕ *and* n ∈ ℕ*, there exists a quantified Boolean formula* φ *such that* φ *is satisfiable if and only if there is a transition system* T = (T, t₀, τ, o) *of size* n *with* T ⊨_{k,I} ϕ*. The size of* φ *is in* O(|ϕ| + n² + k²)*. The number of variables of* φ *is* n · (n · 2^{|I|} + |O|) + k · (|I| + 1) + n · k · (|O| + n + 1)*.*

*Construction.* We encode the bounded synthesis problem in the following quantified Boolean formula:

$$\exists \{ \tau\_{t,i,t'} \mid t, t' \in T, i \in 2^I \} . \exists \{ o\_t \mid t \in T, o \in O \} . \tag{1}$$

$$\forall \{i\_j \mid i \in I, 0 \le j < k\}. \; \forall \{l\_j \mid 0 \le j < k\}. \tag{2}$$

$$\forall \{o\_j \mid o \in O, 0 \le j < n \cdot k\}. \tag{3}$$

$$\forall \{t\_j \mid t \in T, 0 \le j < n \cdot k\}. \tag{4}$$


$$\forall \{l'\_j \mid 0 \le j < n \cdot k\}. \tag{5}$$

$$
\varphi\_{\text{det}} \land \left( \varphi\_{\text{lasso}} \land \varphi\_{\in T}^{n,k} \to \|\varphi\|\_{0}^{k,n\cdot k} \right) \tag{6}
$$

which we read as: there is a transition system (1), such that, for all input sequences representable by lassos of length k (2) the corresponding sequence of outputs of the system (3) satisfies ϕ. The variables introduced in lines (4) and (5) are necessary to encode the corresponding output for the chosen input lasso.

An assignment to the variables satisfies the formula in line (6) if it represents a deterministic transition system (ϕ_det) in which lassos of length n·k (ϕ_lasso ∧ ϕ^{n,k}_∈T) satisfy the property ϕ (||ϕ||^{k,n·k}_0). These constraints are defined as follows.

ϕ_det: A transition system is deterministic if for each state t and input i there is exactly one transition τ_{t,i,t′} to some state t′:

$$\bigwedge\_{t \in T} \bigwedge\_{i \in 2^I} \bigvee\_{t' \in T} \big( \tau\_{t,i,t'} \wedge \bigwedge\_{t'' \in T,\, t'' \ne t'} \neg \tau\_{t,i,t''} \big).$$

ϕ^{n,k}_∈T: for a given input lasso of length k we can match a lasso in the system of size at most n · k. A lasso of this size in the transition system matches the input lasso if the following constraints are satisfied.

$$\bigwedge\_{0 \le j < n \cdot k} \bigwedge\_{t \in T} (t\_j \to \bigwedge\_{o \in O} (o\_j \leftrightarrow o\_{t\_j})) \tag{7}$$

$$\bigwedge\_{0 \le j < n \cdot k} \bigvee\_{t \in T} \big( t\_j \wedge \bigwedge\_{t' \in T,\, t' \ne t} \neg t'\_{j} \big) \tag{8}$$

$$\bigwedge\_{0 \le j < n \cdot k - 1} \bigwedge\_{i \in 2^I,\, t, t' \in T} \Big( \big( \bigwedge\_{0 \le j' < k} (l\_{j'} \to i\_{\Delta(j,k,j')}) \big) \wedge t\_j \to (\tau\_{t,i,t'} \leftrightarrow t'\_{j+1}) \Big) \tag{9}$$

$$\land \bigwedge\_{i \in 2^I,\, t, t' \in T} \Big( \big( \bigwedge\_{0 \le j' < k} (l\_{j'} \to i\_{\Delta(n \cdot k - 1,\, k,\, j')}) \big) \wedge t\_{n \cdot k - 1} \to \big( \tau\_{t,i,t'} \leftrightarrow \bigvee\_{0 \le j < n \cdot k} (l'\_j \wedge t'\_j) \big) \Big) \tag{10}$$

Lines (9) and (10) make sure that the chosen lasso follows the guessed transition relation τ. Line (10) handles the loop transition of the lasso, making sure that the loop of the lasso follows τ. Line (7) is necessary in order to match the output produced on the lasso with ϕ. If the output variables o_j satisfy the constraint ||ϕ||^{k,n·k}_0, then the lasso satisfies ϕ. As the input lasso is smaller than its matching lasso in the system, we need to make sure that the indices of the input variables are correct with respect to the chosen loop. This is computed using the function Δ, which is given by:

$$\Delta(j,k,j') = \begin{cases} j & \text{if } j < k, \\ ((j-k) \mod (k-j')) + j' & \text{otherwise.} \end{cases}$$
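Under the assumption that positions are 0-based, Δ can be written directly in code; unrolling the index sequence for a small invented lasso shows how positions beyond k are folded back into the loop:

```python
def delta(j, k, jp):
    """Index of position j in the unrolled word of a length-k lasso whose
    loop starts at (0-based) position jp."""
    return j if j < k else ((j - k) % (k - jp)) + jp

# Unrolling a lasso with k = 3 and loop start jp = 1: after the first
# k positions 0, 1, 2 the indices cycle through the loop 1, 2, 1, 2, ...
assert [delta(j, 3, 1) for j in range(8)] == [0, 1, 2, 1, 2, 1, 2, 1]
```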

ϕ_lasso: The formula encodes the additional constraint that exactly one of the loop variables can be true for a given variable valuation.

||ϕ||^{k,m}_0: This constraint encodes the satisfaction of ϕ on lassos of size m. The encoding is similar to the encoding of bounded model checking [3], with the distinction of encoding the satisfaction relation of the atomic propositions, given below. As the inputs run with different indices than the outputs, we again, as in lines (9) and (10), need to compute the correct indices using the function Δ.


#### **6 Synthesis of Approximate Implementations**

In some cases, specifications remain unrealizable even when considered under bounded environments. Nevertheless, one might still be able to construct implementations that satisfy the specification for almost all input sequences of the environment. Consider for example the following simplified arbiter specification:

$$
\Box(\overline{w}\to\bigcirc\overline{g})\land\Box(r\to\bigcirc g),
$$

The specification defines an arbiter that should give grants g upon requests r, but is not allowed to provide these grants unless a signal w is true. The specification is unrealizable, because a sequence of inputs where the signal w is always false prevents the arbiter from answering any request. Bounding the environment does not help in this case, as a lasso of size 1 already suffices to violate the specification (the one where w is always false). Nevertheless, one can still find reasonable implementations that satisfy the specification for a large fraction of input sequences. In particular, input sequences where w remains false forever constitute only a small fraction of all input sequences.
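To make the "small fraction" claim concrete, one can enumerate input loops of a fixed length k over the inputs w and r and count those on which every implementation must fail (w stays false throughout while some request arrives). This toy count ranges over plain loop words of length k rather than the paper's exact lasso language L^I_k, so it is only an illustration:

```python
from itertools import product

def violating_fraction(k):
    """Fraction of length-k input loops over the inputs (w, r) on which no
    implementation can satisfy the arbiter spec: w stays false throughout
    while at least one request r arrives."""
    total = bad = 0
    for loop in product([(w, r) for w in (0, 1) for r in (0, 1)], repeat=k):
        total += 1
        if all(w == 0 for w, _ in loop) and any(r for _, r in loop):
            bad += 1
    return bad / total

# For k = 3: w must be 0 at all three positions and r true somewhere,
# i.e. (2**3 - 1) / 4**3 = 7/64; the fraction shrinks as k grows.
assert abs(violating_fraction(3) - 7 / 64) < 1e-12
```

The unavoidably violated fraction decays exponentially in k, which is why approximate implementations with small error rates exist even though the specification is unrealizable.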

**Definition 3 (ε-k-Approximation).** *For a specification* ϕ*, a bound* k*, and an error rate* ε*, we say that a transition system* T *approximately satisfies* ϕ *with error rate* ε *for lassos of length at most* k*, denoted by* T ⊨^ε_{k,I} ϕ*, if and only if*

$$\frac{|\{\sigma \mid \sigma \in L^I\_k(L(T)),\ \sigma \models \varphi\}|}{|L^I\_k((2^I)^\omega)|} \ge 1 - \epsilon.$$

*We call* T *an* ε*-*k*-approximation of* ϕ*.*

**Theorem 7.** *For a specification given as a deterministic parity automaton* P*, a bound* k*, and an error rate* 0 ≤ ε ≤ 1*, checking whether there is an implementation* T *such that* T ⊨^ε_{k,I} P *can be done in time polynomial in* |P| *and exponential in* k*.*

*Proof.* For given ε and k, we construct a nondeterministic parity tree automaton N that accepts all ε-k-approximations with respect to L(P). From ε, we can compute the minimal number m of lassos from L^I_k((2^I)^ω) for which an ε-k-approximation has to satisfy the specification. In its initial state, the automaton N guesses m lassos and accepts a transition system if it does not violate the specification on any of these lassos. The latter check is done by following the structure of the automaton constructed for P using Theorem 4. To check whether there is an ε-k-approximation for P, we solve the emptiness game of N. The size of N is (2^k)^{m+1} · |P|. ∎
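For small parameters, the quantities in this proof can be computed by brute force: deduplicating lassos by unrolling them to a fixed length gives |L^I_k((2^I)^ω)|, and m is then the least number of those words covering a 1−ε fraction. A sketch over an explicit alphabet (the function names and the unrolling length 2k, which suffices for the small cases tested here, are our own choices):

```python
from itertools import product
from math import ceil

def count_k_lasso_words(alphabet, k):
    """Count distinct infinite words representable by lassos of length at
    most k, deduplicating lassos by unrolling them to length 2k."""
    seen = set()
    for n in range(1, k + 1):
        for word in product(alphabet, repeat=n):
            for s in range(n):  # loop start
                unroll = tuple(word[j] if j < n else word[s + (j - s) % (n - s)]
                               for j in range(2 * k))
                seen.add(unroll)
    return len(seen)

def min_satisfied(eps, alphabet, k):
    """Minimal number m of lassos on which an eps-k-approximation must
    satisfy the specification."""
    return ceil((1 - eps) * count_k_lasso_words(alphabet, k))

# Over a binary input, the words representable by lassos of length <= 2 are
# 0^w, 1^w, (01)^w, (10)^w, 0.1^w, 1.0^w:
assert count_k_lasso_words([0, 1], 2) == 6
```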

#### **6.1 Symbolic Approach**

In the following, we present a symbolic approach for finding ε-k-approximations based on maximum model counting. We show that we can build a constraint system and apply a maximum model counting algorithm to compute a transition system that satisfies a specification for a maximum number of input sequences.

**Definition 4 (Maximum Model Counting** [5]**).** *Let* X, Y*, and* Z *be sets of propositional variables and* φ *a formula over* X, Y*, and* Z*. Let* x *denote an assignment to* X*,* y *an assignment to* Y*, and* z *an assignment to* Z*. The maximum model counting problem for* φ *over* X *and* Y *is to compute* max_x #y. ∃z. φ(x, y, z)*.*
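Definition 4 can be prototyped by exhaustive enumeration, which is useful as a test oracle for tiny instances (the toy formula and variable names are invented; real tools such as MaxCount replace the enumeration with approximate counting):

```python
from itertools import product

def max_model_count(phi, X, Y, Z):
    """Brute-force maximum model counting: the assignment to X maximizing
    the number of assignments to Y for which some assignment to Z makes
    phi true. phi maps three dicts (x, y, z) to a bool."""
    def assignments(vs):
        return [dict(zip(vs, bits)) for bits in product([False, True], repeat=len(vs))]
    return max(
        ((x, sum(1 for y in assignments(Y)
                 if any(phi(x, y, z) for z in assignments(Z))))
         for x in assignments(X)),
        key=lambda pair: pair[1])

# Invented toy formula: phi = (x1 or y1) and (y2 or z1).
phi = lambda x, y, z: (x["x1"] or y["y1"]) and (y["y2"] or z["z1"])
best_x, count = max_model_count(phi, ["x1"], ["y1", "y2"], ["z1"])
# With x1 = True, every one of the four y-assignments is satisfiable
# via z1 = True, so the maximum count is 4.
assert best_x["x1"] is True and count == 4
```

In the synthesis encoding below, X plays the role of the system variables, Y the input-lasso variables, and Z the remaining auxiliary variables.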

For a specification ϕ and bounds k and n on the length of the lassos and the size of the system, respectively, we can compute an ε-k-approximation for ϕ by applying a maximum model counting algorithm to the constraint system given below. It encodes transition systems of size n together with input lassos of length k on which the system satisfies ϕ.

$$\exists \{ \tau\_{t,i,t'} \mid t, t' \in T, i \in 2^I \} . \exists \{ o\_t \mid t \in T, o \in O \} . \tag{11}$$

$$\exists \{i\_j \mid i \in I, 0 \le j < k\}. \exists \{l\_j \mid 0 \le j < k\}. \tag{12}$$


$$
\varphi\_{\text{det}} \wedge \varphi\_{\text{lasso}} \wedge \varphi\_{\in T}^{n,k} \wedge \|\varphi\|\_{0}^{k,n\cdot k} \wedge \|k\|\_{0} \tag{17}
$$

To check the existence of an ε-k-approximation, we maximize over the assignments to the variables that define the transition system (line 11) and count over the variables that define input sequences of the environment, given by lassos of length k. As two input lassos of the same length may induce the same infinite input sequence, we count over auxiliary variables that represent unrollings of the lassos instead of counting over the input propositions themselves (line 13).

The formulas ϕ_det, ϕ_lasso, ϕ^{n,k}_∈T, and ||ϕ||^{k,n·k}_0 are defined as in the previous section. The formula ||k||_0 is defined over the variables in line (13) and makes sure that input lassos representing the same infinite sequence are not counted twice, by unrolling each lasso to length 2k.

**Theorem 8.** *For a specification given as an LTL formula* ϕ*, bounds* k *and* n*, and an error rate* ε*, the propositional formula* φ *defined above is of size* O(|ϕ| + n² + k²)*. The number of variables of* φ *is* n · (n · 2^{|I|} + |O|) + k · (k · |I| + |I| + 1) + n · k · (|O| + n + 1)*.*

#### **7 Experimental Results**

We implemented the symbolic encodings for the exact and approximate synthesis methods, and evaluated our approach on a bounded version of the greedy arbiter specification given in Sect. 1, and another specification of a round-robin arbiter. The round-robin arbiter is defined by the specification:

$$
\Box \Diamond w \to \Box \Diamond g\_1 \wedge \Box \Diamond g\_2 \wedge \Box(\neg w \to \bigcirc(\neg g\_1 \wedge \neg g\_2)) \wedge \Box(\neg g\_1 \vee \neg g\_2)
$$

This specification is realizable with transition systems of size at least 4. We used our implementation to check whether we can find approximate solutions of smaller size. We used the tool CAQE [10] for solving the QBF instances and the tool MaxCount [5] for solving the approximate synthesis instances.

**Table 1.** Experimental results for the symbolic approaches. The rate in the approximate approach is the rate of input lassos on which the specification is satisfied.


The results are presented in Table 1. As usual in synthesis, the size of the instances grows quickly as the size bound and the number of processes increase. Inspecting the encoding constraints shows that the constraint for the specification is responsible for more than 80% of the gates in the encoding. The results show that, using the proposed approach, we can synthesize implementations for unrealizable specifications by bounding the environment. The results for the approximate synthesis method further demonstrate that in the unrealizable cases one can still obtain approximate implementations that satisfy the specification on a large number of input sequences.

#### **8 Conclusion**

In many cases, the unrealizability of a specification is due to the assumption that the environment has unlimited power in producing inputs to the system. In this paper, we have investigated the problem of synthesizing implementations under bounded environment behaviors. We have presented algorithms for solving the synthesis problem for bounded lassos and the synthesis of approximate implementations that satisfy the specification up to a certain rate.

We have also provided polynomial encodings of the problems into quantified Boolean formulas and maximum model counting instances. Our experiments demonstrate the principal feasibility of the approach. Our experiments also show that the instances can quickly become large. While this is a common phenomenon for synthesis, there clearly is a lot of room for optimization and experimentation with both the solvers for quantified Boolean expressions and for maximum model counting.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Quantified Invariants via Syntax-Guided Synthesis**

Grigory Fedyukovich1(B) , Sumanth Prabhu<sup>2</sup>, Kumar Madhukar<sup>2</sup>, and Aarti Gupta<sup>1</sup>

<sup>1</sup> Princeton University, Princeton, USA {grigoryf,aartig}@cs.princeton.edu <sup>2</sup> TCS Research, Pune, India {sumanth.prabhu,kumar.madhukar}@tcs.com

**Abstract.** Programs with arrays are ubiquitous. Automated reasoning about arrays necessitates discovering properties about ranges of elements at certain program points. Such properties are formally specified by universally quantified formulas, which are difficult to find and difficult to prove inductive. In this paper, we propose an algorithm based on an enumerative search that discovers quantified invariants in stages. First, by exploiting the program syntax, it identifies ranges of elements accessed in each loop. Second, it identifies potentially useful facts about individual elements and generalizes them to hypotheses about entire ranges. Finally, by applying recent advances in SMT solving, the algorithm filters out wrong hypotheses. The combination of properties is often enough to prove that the program meets a safety specification. The algorithm has been implemented in a solver for Constrained Horn Clauses, FreqHorn, and extended to deal with multiple (possibly nested) loops. We show that FreqHorn advances the state of the art on a wide range of public array-handling programs.

#### **1 Introduction**

Formally verifying programs against safety specifications is difficult. This problem worsens in the presence of data structures like lists, arrays, and maps, which are ubiquitous in real-world applications. For instance, proving an array-handling program safe often requires discovering an inductive invariant that is universally quantified over ranges of array elements. Such invariants help to prove the unreachability of error states independently of the size of the array. However, the majority of invariant synthesis approaches are limited to quantifier-free numerical invariants. The approach presented in this paper contributes an effective technique to discover quantified invariants over arrays and linear integer arithmetic.

Syntax-guided techniques [3] have recently been applied to synthesize quantifier-free numerical invariants [15–17,34] in the approach called FreqHorn. In a nutshell, FreqHorn collects various statistics from the syntactic patterns occurring in the program's source code and uses them to construct a set of formal grammars that specify a search space for invariants. It is often sufficient to perform an *enumerative search* over the formulas produced from these grammars and to identify a set of suitable inductive invariants among them using an off-the-shelf solver for Satisfiability Modulo Theories (SMT). The presence of arrays complicates this reasoning in two respects: it is hard to find suitable candidates, and it is difficult to prove them inductive.

In this paper, we present a novel technique that extends the approach of enumerative search in general, and its instantiation in FreqHorn in particular, to reason about quantifiers. It discovers invariants over arrays in multiple stages. First, by exploiting the program syntax, it identifies ranges of elements accessed in each loop. Second, it identifies potentially useful facts about individual elements and generalizes them to hypotheses about entire ranges. The SMT-based validation of candidates, which are quantified formulas, is often inexpensive as they are constructed using the same syntactic patterns that appear in the source code. Furthermore, for supporting certain corner cases, our approach allows specifying additional rules that help in generalizing learned properties. The combination of properties proven inductive by an SMT solver is often enough to prove that the program meets a safety specification.

We show that FreqHorn advances the state of the art on a selection of array-handling programs from SVCOMP<sup>1</sup> and the literature. For instance, it can prove completely automatically that an array is monotone after applying a sorting algorithm. Furthermore, FreqHorn is able to discover quantifier-free invariants over integer variables in the program and use them as inductive relatives while checking the inductiveness of quantified candidates over arrays, and vice versa.

While a detailed discussion of the related work comes later in the paper (Sect. 6), it is noteworthy that being syntax-guided crucially helps us overcome several limitations of other techniques to verify array-handling programs [2,9, 11,35]. Most of them avoid inferring quantified invariants explicitly and thus do not produce checkable proofs. As a result, tools are fragile and in practice often output false positives (see Sect. 5 for concrete results). By comparison, our approach never produces false positives, and its results can be validated by existing SMT solvers.

The core contributions of this work are:


The rest of the paper is structured as follows. In Sect. 2, we give background and notation and illustrate our approach on an example. Our main contributions are then presented in Sect. 3 (main algorithm) and Sect. 4 (important design choices). In Sect. 5, we show the evaluation and comparison with the state of the art. Finally, the related work and conclusion complete the paper in Sects. 6 and 7, respectively.

<sup>1</sup> Software Verification Competition, http://sv-comp.sosy-lab.org/.

#### **2 Background**

The Satisfiability Modulo Theories (SMT) task is to decide whether there is an assignment m of values to the variables of a first-order logic formula ϕ that makes it true. We write ϕ =⇒ ψ if every satisfying assignment of ϕ is also a satisfying assignment of the formula ψ. By *Expr* we denote the space of all possible quantifier-free formulas in our background theory, and by *Vars* a range of possible variables.

#### **2.1 Programs as Constrained Horn Clauses**

To guarantee expected behaviors, programs require proofs, such as inductive invariants, ranking functions, or recurrence sets. It is becoming increasingly popular to consider a verification task as a *proof synthesis* task, formulated as a system of SMT formulas involving unknown predicates, also known as *constrained Horn clauses* (CHC). The synthesis goal is to discover a suitable interpretation of all unknown predicates that makes all CHCs true. CHCs offer the advantages of flexibility and modularity in designing verifiers for various systems and languages. CHCs can be constructed in a way that captures the operational semantics of the language in question, and an off-the-shelf CHC solver can be used for solving the resulting formulas.

**Definition 1.** *A* linear constrained Horn clause *(CHC) over a set of uninterpreted relation symbols R is a formula in first-order logic that has the form of one of three implications (called respectively a* fact*, an* inductive clause*, and a* query*):*

$$\begin{aligned} \varphi(\vec{x\_1}) &\implies \mathit{inv}\_1(\vec{x\_1})\\ \mathit{inv}\_1(\vec{x\_1}) \land \varphi(\vec{x\_1}, \vec{x\_2}) &\implies \mathit{inv}\_2(\vec{x\_2})\\ \mathit{inv}\_1(\vec{x\_1}) \land \varphi(\vec{x\_1}) &\implies \bot \end{aligned}$$

*where inv*1, *inv*2 ∈ *R are uninterpreted symbols,* x1 *and* x2 *are vectors of variables, and* ϕ*, called a* body*, is a fully interpreted formula (i.e.,* ϕ *does not contain applications of inv*1 *or inv*2*).*

For a CHC C, by *src*(C) we denote the application of *inv* ∈ *R* in the premise of C (if C is a fact, we write *src*(C) def= ⊤). Similarly, by *dst*(C) we denote the application of *inv* ∈ *R* in the conclusion of C (if C is a query, we write *dst*(C) def= ⊥). We define functions *rel* and *args* such that for each *inv*(x), *rel*(*inv*(x)) def= *inv* and *args*(*inv*(x)) def= x. For a CHC C, by *body*(C) we denote the body (i.e., ϕ) of C.
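The definitions above can be mirrored in a small data structure. The following Python sketch is illustrative only: bodies are modeled as Python predicates rather than first-order formulas, and the three CHCs encode a hypothetical counting loop, not the paper's encoding.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass(frozen=True)
class App:                      # an application inv(x) of a relation symbol
    rel: str
    args: Tuple[str, ...]

@dataclass(frozen=True)
class CHC:
    src: Optional[App]          # None plays the role of "true" (a fact)
    body: Callable[..., bool]   # a stand-in for a fully interpreted formula
    dst: Optional[App]          # None plays the role of "false" (a query)

def rel(app):  return app.rel
def args(app): return app.args

# A fact, an inductive clause, and a query for a simple counting loop.
fact      = CHC(None, lambda i1: i1 == 0, App("inv", ("i1",)))
inductive = CHC(App("inv", ("i1",)),
                lambda i1, i2: i1 < 10 and i2 == i1 + 1,
                App("inv", ("i2",)))
query     = CHC(App("inv", ("i1",)), lambda i1: i1 > 10, None)

def is_inductive(c):
    # C is inductive when rel(src(C)) = rel(dst(C)).
    return c.src is not None and c.dst is not None and rel(c.src) == rel(c.dst)

print([is_inductive(c) for c in (fact, inductive, query)])
```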

*Example 1.* Figure 1 gives a program in the C programming language that handles two integer arrays, A and B, both of an unknown size N. The array A has unknown content, and the program first identifies a value m which is smaller than or

```
int N = nondetInt();
int *A = nondetArray(N);
int m = 0;
for (int i = N - 1; i >= 0; i--) { if (m > A[i]) m = A[i]; }
int *B = malloc(N * sizeof(int));
for (int i = 0; i < N; i++) { B[N - i - 1] = A[i] - m; }
int s = 0;
for (int i = 0; i < N; i++) { s = s + B[i]; }
assert(s >= 0);
```
**Fig. 1.** Example program: source code in C.

**Fig. 2.** Example program: CHC encoding.

equal to all elements of A (it is either a minimal element of A or 0). Then, the program populates B with the values of A with m subtracted. Interestingly, the order of the elements of A and B is not preserved: e.g., A[0] - m gets written to B[N - 1], and so on. Finally, the program computes the sum s of all elements in B and requires us to prove that s is never negative.
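As a sanity check of the described behavior, the program of Fig. 1 can be transliterated and tested on concrete inputs. This only samples behaviors and is no substitute for an invariant-based proof:

```python
import random

def run(A):
    # Direct transliteration of the program in Fig. 1.
    N = len(A)
    m = 0
    for i in range(N - 1, -1, -1):   # first loop: m <= min(A) and m <= 0
        if m > A[i]:
            m = A[i]
    B = [0] * N
    for i in range(N):               # second loop: reversed, shifted copy
        B[N - i - 1] = A[i] - m
    s = 0
    for i in range(N):               # third loop: sum of B
        s += B[i]
    return s

random.seed(0)
for _ in range(1000):
    N = random.randrange(0, 8)
    A = [random.randrange(-50, 50) for _ in range(N)]
    assert run(A) >= 0               # the assertion from Fig. 1
print("assertion held on all sampled inputs")
```

Each B[k] equals A[i] - m with m ≤ A[i], so every summand is non-negative, which is exactly what the quantified invariants below capture.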

Figure 2 gives a CHC encoding of the program. The system has three uninterpreted predicates, *inv*1, *inv*2, and *inv*3, corresponding to the invariants at the heads of the three loops. The primed variables correspond to modified variables. Rules **B**, **D**, and **F** encode the loop bodies, and the remaining rules encode the fragments of code before, after, or between the loops. In particular, rule **G** ensures that after the third loop has terminated, a program state with a negative value of s is unreachable. Before we describe how our technique solves this CHC system (see Sect. 2.2), we briefly introduce the notion of satisfiability of CHCs.

**Definition 2.** *Given a set of uninterpreted relation symbols R and a set* S *of CHCs over R, we say that* S *is satisfiable if there exists an interpretation that assigns to each n-ary symbol inv* ∈ *R a relation over n-tuples and makes all implications in* S *valid.*

In the paper, we assume that a relation assigned by an interpretation is represented by a formula ψ over at most n free variables.

We call a CHC C *inductive* when *rel*(*src*(C)) = *rel*(*dst*(C)) = *inv* for some *inv*. For arrays accessed in a loop, we assume the existence of an integer counter variable. More formally:

**Definition 3.** *Let* C *be an inductive CHC,* x = *args*(*src*(C))*, and* x' = *args*(*dst*(C))*. We say that* C *is* array-handling *if there exist numbers* c *and* a *such that* (1) 1 ≤ c ≤ |x| *and* 1 ≤ a ≤ |x|*;* (2) x[c] *(and consequently its "primed copy"* x'[c]*) has type integer;* (3) *either of these implications holds:*

$$body(C) \implies \vec{x}[c] < \vec{x}\,'[c] \tag{1}$$

$$body(C) \implies \vec{x}[c] > \vec{x}\,'[c] \tag{2}$$

(4) x[a] *(and consequently* x'[a]*) has type array; and* (5) *there is an* access function f *that identifies a relationship between an access to* x[a] *in* body(C) *and* x[c]*.*
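The counter queries (1) and (2) are validity checks that an SMT solver would discharge. Purely as an illustration, they can be approximated dynamically by observing which variable strictly increases or decreases on every sampled transition of a loop body; the helper names and the sampled-execution setup below are our own, not part of the approach.

```python
def find_counters(step, init_states, steps=50):
    # Approximation of queries (1)/(2): a variable is reported as a
    # candidate counter if it strictly increases, or strictly decreases,
    # on every observed transition of the loop body `step`. A real
    # implementation would instead ask a solver whether
    # body(C) implies x[c] < x'[c] (or x[c] > x'[c]).
    inc, dec = None, None
    for st in init_states:
        cur = dict(st)
        for _ in range(steps):
            nxt = step(cur)
            if nxt is None:           # loop guard failed
                break
            up   = {v for v in cur if nxt[v] > cur[v]}
            down = {v for v in cur if nxt[v] < cur[v]}
            inc = up if inc is None else inc & up
            dec = down if dec is None else dec & down
            cur = nxt
    return sorted((inc or set()) | (dec or set()))

# Loop body of the first loop in Fig. 1: i always decreases,
# m only sometimes changes, A never changes.
def body(st):
    if st["i"] < 0:
        return None
    m = min(st["m"], st["A"][st["i"]])
    return {"i": st["i"] - 1, "m": m, "A": st["A"]}

inits = [{"i": 4, "m": 0, "A": [1, 2, 3, 4, 5]},
         {"i": 4, "m": 0, "A": [-5, -4, -3, -2, -1]}]
print(find_counters(body, inits))
```

Since the check is based on sampled runs, its output is only a candidate; the solver-based queries (1)/(2) remain the ground truth.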

#### **2.2 Illustrating Example**

The CHC system in Fig. 2 has a solution, indicating that the program meets its specification. In particular:

$$\begin{aligned} \mathit{inv}_1 &\mapsto \forall j \,.\, i < j < N \implies m \le A[j] \\ \mathit{inv}_2 &\mapsto (\forall j \,.\, 0 \le j < N \implies m \le A[j]) \;\wedge\; (\forall j \,.\, 0 \le j < i \implies B[N - j - 1] = A[j] - m) \\ \mathit{inv}_3 &\mapsto (\forall j \,.\, 0 \le j < N \implies m \le A[j]) \;\wedge\; (\forall j \,.\, 0 \le j < N \implies B[N - j - 1] = A[j] - m) \;\wedge\; s \ge 0 \end{aligned}$$

The interpretation of *inv*1 means that as the first loop progresses (i.e., all elements A[N−1], A[N−2], . . . , A[i+1] are sequentially considered), the value of m is always less than or equal to all the considered elements. Thus, we refer to the interpretation of *inv*1 as a *progress lemma*. When the first loop has terminated, clearly, this property holds for all elements from A[0] to A[N−1]. Because A passes through the second loop without any changes, the interpretation of *inv*1 gets finalized (thus, it becomes a *finalized lemma*) and is added to the interpretation of *inv*2.

Additionally, the interpretation of *inv*2 gets a relational fact about the pairs of elements A[0] and B[N−1], A[1] and B[N−2], . . . , A[i−1] and B[N−i], which again appears as a progress lemma and then gets finalized in the interpretation of *inv*3. With these two quantified invariants about all elements of A, and the relation about pairs of elements of A and B, it is possible to derive the remaining lemma in the interpretation of *inv*3, namely s ≥ 0, which concludes the proof.

#### **3 Invariants via Enumerative Search**

In this work, we aim at discovering a solution for a CHC system S over a set of uninterpreted symbols *R* enumeratively, i.e., by guessing a candidate formula for each *inv* ∈ *R*, substituting it into all CHCs C ∈ S, and checking their validity.

#### **3.1 Quantifier-Free Invariants**

We build on top of an algorithm called FreqHorn, recently proposed in [17]. Its key insight is the automatic construction of a set of formal grammars G(*inv*) for each *inv* ∈ *R* based on either the source code, program behaviors, or both. Importantly, these grammars are *conjunction-free*: they cannot be used to produce a conjunction of clauses and can give rise to only a finite number of formulas potentially related to invariants (otherwise, the approach would not guarantee strong convergence). Since invariants are often represented by a conjunction of lemmas, FreqHorn attempts to sample each lemma (i.e., recursively apply production rules) from a grammar separately, until a combination of them is sufficient for inductiveness and safety, or the search space is exhausted. FreqHorn relies on an SMT solver to filter out unsuccessfully sampled lemmas.

The construction of formal grammars is biased by the syntax of the CHC encoding. First, FreqHorn collects a set of *Seeds* by converting the body of each CHC to Conjunctive Normal Form and then extracting and normalizing each conjunct. The set of seeds can then optionally be extended with *behavioral seeds* and *bounded proofs*, constructed respectively from concrete values of variables obtained from actual program runs and from Craig interpolants of unsatisfiable finite unrollings of the CHC system. Finally, the production rules are created so as to produce the seeds and also their *mutants* (i.e., formulas syntactically similar to seeds). In general, no specific restriction on the grammar-construction method is imposed; in practice, the grammars are allowed to be more (or less) general to enable a broader (or more focused) search space for invariants.
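A toy, string-level Python sketch of seed extraction and mutant generation; real FreqHorn operates on the SMT terms of CHC bodies, so every name and the string representation here are simplified stand-ins.

```python
def seeds_from_body(body):
    # Treat the body as a conjunction and split it into normalized
    # conjuncts (a stand-in for CNF conversion over SMT terms).
    return [c.strip() for c in body.split("&&")]

def mutants(seed, ops=("<", "<=", ">", ">=", "==")):
    # Mutants: syntactically similar formulas obtained by swapping the
    # first comparison operator found in the seed.
    out = set()
    for op in ops:
        if op in seed:
            for other in ops:
                out.add(seed.replace(op, other, 1))
    return sorted(out)

body = "i < N && s == s + B[i]"
seeds = seeds_from_body(body)
print(seeds)
print(mutants("i < N"))
```

Production rules built from such seeds and mutants then span the (finite) search space the enumerative search explores.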

#### **3.2 Quantified Candidates from Quantifier-Free Grammars**

The main obstacle to applying the enumerative search to generate array invariants is that the grammars do not allow quantifiers. Because grammars are constructed automatically from syntactic patterns that appear in the original programs, in the presence of arrays we can only expect expressions involving particular elements of arrays (such as ones accessed via a loop counter). However, since each loop repeats certain operations over a *range* of array elements, we have to *generalize* the extracted expressions about individual elements to expressions about entire ranges.

Let a set of variables associated with a relation symbol *inv* be *Vars*(*inv*) def = *IntVars*(*inv*) ∪ *ArrVars*(*inv*), where *IntVars*(*inv*) and *ArrVars*(*inv*) are disjoint and contain integer variables and array variables, respectively. A candidate quantified invariant over arrays consists of three parts:


**Algorithm 1.** Prepare(S, *R* )

```
Input: CHCs S over R
Output: formal grammars G(inv), quantified variables QVars(inv), and
        progressRange(inv) for each inv ∈ R
1 for each inv ∈ R do
2   Seeds ← SyntSeeds(inv) ∪ BehavSeeds(inv);
3   cnt ← getCounters(S, inv, ArrVars(inv));
4   if cnt ≠ ∅ then
5     QVars(inv) ← copy(cnt);
6     progressRange(inv) ← getRange(cnt);
7     G(inv) ← Replace(getGrammar(Seeds), cnt, QVars(inv));
```

**Algorithm 2.** SolveArrayCHCs(S, *R* )

```
Input: CHCs S over R
Output: res ∈ {sat, unknown}, Lemmas : R → 2^Expr
 1 G, QVars, progressRange ← Prepare(S, R);
 2 for each inv ∈ R do Lemmas(inv) ← ∅;
 3 while ∃C ∈ S . ⋀_{ℓ ∈ Lemmas(rel(src(C)))} ℓ(args(src(C))) ∧ body(C) ≠⇒ ⊥ do
 4   if ∀inv ∈ R . allBlocked(G(inv)) then return unknown, ∅;
 5   inv ← pickLoop(R);
 6   if QVars(inv) = ∅ then Cand(inv) ← sample(G(inv));
 7   else Cand(inv) ← ∀QVars(inv) . QVars(inv) ∈ progressRange(inv) =⇒ sample(G(inv));
 8   ExtCand ← extend(S, {inv}, Cand, Lemmas);
 9   if ∀inv' ∈ R . ExtCand(inv') = ⊥ then G(inv) ← block(G, Cand, inv);
10   else
11     for each inv' ∈ R do
12       Lemmas(inv') ← Lemmas(inv') ∪ {ExtCand(inv')};
13       G(inv') ← block(G, ExtCand, inv');
14 return sat, Lemmas;
```

A naive idea for getting a range formula and a cell property is to sample them separately and then to bind them together using some *QVars*(*inv*). But this would result in a large search space. Algorithm 1 gives a more tailored procedure. The central role in this process is played by an analysis of the loop counters that are used to access array elements (line 3). This analysis is performed once for each loop before the main verification process, and thus its results are reused across all iterations of the verification process.

Our algorithm identifies *QVars*(*inv*) by creating a fresh variable for each counter, including counters of nested loops (line 5). It then generates range formulas based on the results of the analysis (line 6) such that: (1) the range formula itself is an inductive invariant for *inv*, and (2) the range formula is expressed over the initial values of counters of *inv* and the counters themselves. Finally, only a cell property is going to be produced from the grammar G(*inv*),

#### **Algorithm 3.** weaken(S , *R* , *Cand*, *Lemmas*)

```
Input: CHCs S over R, candidates Cand(inv) and learned Lemmas(inv)
       for each inv ∈ R
Output: weakened Cand
 1 toRecheck ← ⊥;
 2 for all C ∈ S do
 3   if ⋀_{ℓ ∈ Lemmas(rel(src(C)))} ℓ(args(src(C))) ∧ Cand(rel(src(C)))(args(src(C))) ∧
        body(C) ≠⇒ Cand(rel(dst(C)))(args(dst(C))) then
 4     if isFinalizedArrayCand(Cand, rel(dst(C))) then
 5       Cand(rel(dst(C))) ← getRegressCand(Cand, rel(dst(C)));
 6     else
 7       Cand(rel(dst(C))) ← ⊤;
 8     toRecheck ← ⊤;
 9     break;
10 if toRecheck then return weaken(S, R, Cand, Lemmas);
11 else return Cand;
```


constructed from the seeds (recall Sect. 3.1), in which all counters are replaced by the corresponding variables from *QVars*(*inv*) (line 7). Thus, the only part of the candidate formula where the counter can appear is the range formula.

Once grammars, *QVars*, and ranges are detected, our approach proceeds to sample candidates and to check them with an SMT solver. The general flow is illustrated in Algorithm 2. For each *inv* ∈ *R*, it initializes a set *Lemmas*(*inv*) (line 2). Then it iteratively guesses lemmas until a combination of them is inductive and safe, or the search space is exhausted (lines 3–4).

Compared to the baseline approach from [17], our new algorithm fixes a shape for the candidates for arrays. At the same time, it permits sampling quantifier-free candidates (line 6): they can be formulas over counters or any other variables in the loop, or even formulas over isolated array elements (e.g., if accessed by a constant index). Then (line 8), Algorithm 2 propagates candidates through all available implications in the CHCs using quantifier elimination and identifies lemmas among the candidates. This step is similar to the baseline approach from [17], but for completeness of presentation, we provide the pseudocode in Algorithms 3 and 4. The only differences are (1) in the implementation of candidate propagation for array candidates and (2) in the weakening of failed candidates (both in Algorithm 3, to be discussed in Sects. 4.3 and 4.4, respectively).

Both successful and unsuccessful candidates are "blocked" from their grammars to avoid re-sampling them in subsequent iterations. This fact, together with the grammars being conjunction-free, is the key to proving the following theorem.

**Theorem 1.** *Algorithm 2 always makes a finite number of iterations, and if it returns with* SAT *then the CHC system is satisfiable.*

The next section discusses a particular instantiation of important subroutines that make our invariant synthesizer effective in practice.

#### **4 Design Choices**

Our main contribution is a completely automated algorithm for finding quantified invariants for array-handling loops. In this section, we first show how, by exploiting the program syntax, we can identify the ranges of elements accessed in each loop (Sect. 4.1). Second, we present an intuitive justification of why our candidates can often be proved as lemmas by an off-the-shelf SMT solver (Sect. 4.2). Finally, we extend our algorithm to handle more complicated cases of multiple loops (Sects. 4.3–4.4) and benchmarks of the tiling technique [9], which are adapted from industrial battery-controller code (Sect. 4.5).

#### **4.1 Discovery of Progress Lemmas**

We start with the simplest scenario of a single loop handling just one array. Let S be a system of CHCs over a set of uninterpreted relation symbols *R* . Let *inv* ∈ *R* correspond to a loop, in which arrays are accessed using some counter variable i (counters are automatically identified by posing and solving queries of forms (1) and (2)).

Recall that we do not necessarily require the array elements to be accessed directly by i; we allow an access function f to identify relationships between i and the index of the accessed element. However, we assume that the counter is unique in the loop, as is the case in most practical applications. In principle, our algorithm can be extended to loops handling several independent counters (although this is rare in practice) with the help of additionally discovered lemmas that describe relationships among the counters. We leave a discussion of this to future work.

**Definition 4.** *A* range *of inv and a counter* i *is a formula over IntVars*(*inv*) *and a free variable* v *of the form* L < v ∧ v < U*, such that either of the formulas* L < i *or* i < U *is a lemma for inv. A* progress range *is either a formula* L < v ∧ v < i *(if* L < i *is a lemma) or a formula* i < v ∧ v < U *(if* i < U *is a lemma).*

Both ranges and progress ranges can be identified statically. Let C1 and C2 be two CHCs such that *inv* = *rel*(*dst*(C1)) = *rel*(*src*(C2)) = *rel*(*dst*(C2)) and *inv* ≠ *rel*(*src*(C1)). It is common in practice that *body*(C1) identifies a symbolic bound b on the initial value of i: either a lower bound (if i increments in *body*(C2)) or an upper bound (if i decrements). In this case, a progress range of *inv* is simply computed as a lemma for *inv* over i and b. A range of *inv* can often be constructed as a conjunction of the progress range with the negation of the termination condition of *body*(C2).<sup>2</sup>

*Example 2.* For the CHC encoding of the program shown in Fig. 2, the ranges of *inv*1, *inv*2, and *inv*3 are all equal to −1 < v < N. The progress range of *inv*1 is i < v < N, and the progress ranges of *inv*2 and *inv*3 are −1 < v < i.

We call candidates that use progress ranges on their left-hand sides *progress candidates*:

$$\forall \vec{q} \,.\; \mathit{progressRange}(\mathit{inv})(\vec{q}) \implies \mathit{cand}$$

where q = *QVars*(*inv*) and cand is a quantifier-free formula over *QVars*(*inv*) ∪ *IntVars*(*inv*). As can be seen from Algorithm 1, all sampled candidates are progress candidates. However, during the next steps of the algorithm (i.e., propagation and weakening) we will use other kinds of candidates (namely, *regress* and *finalized*; see Sects. 4.3 and 4.4, respectively).

If a progress candidate is proven inductive, we call it a *progress lemma*.

#### **4.2 SMT-Based Inductiveness Checking**

We rely on recent advances in SMT solving to identify successful candidates, a conjunction of which is directly used to prove the desired safety specification. In general, solving quantified formulas for validity is a hard task; however, in certain cases the initiation and inductiveness queries can be simplified and reduced to a sequence of (sometimes even quantifier-free) formulas over integer arithmetic. We illustrate such a proving strategy, inspired by the *tiling* approach [9], on the following example.

*Example 3.* Recall the CHC system from Fig. 2. Consider a progress candidate ∀j . i < j < N =⇒ m ≤ A[j] for *inv*1. Checking its initiation (i.e., for CHC **A**) requires deciding the validity of the following quantified formula:

$$i' = N - 1 \wedge m' = 0 \implies \left(\forall j \,.\, i' < j < N \implies m' \le A[j]\right) \tag{3}$$

The range formula i' < j < N simplifies to N − 1 < j < N, which is always false, making formula (3) trivially valid.

<sup>2</sup> Thus, we explicitly require guards of loops to have the forms of an inequality, which is the most common array access pattern.

Checking the inductiveness of the candidate (i.e., for CHC **B**) boils down to solving a more complicated formula:

$$\begin{aligned} &\left( \forall j \,.\, i < j < N \implies m \le A[j] \right) \wedge i \ge 0 \wedge m' = \mathit{ite}(m > A[i], A[i], m) \wedge i' = i - 1 \\ &\qquad \implies \left( \forall j \,.\, i' < j < N \implies m' \le A[j] \right) \end{aligned} \tag{4}$$

Although quantifiers are present on both sides of (4), proving its validity is not hard. Indeed, the query is reducible to two implications:

$$\left(\forall j \,.\, i < j < N \implies m \le A[j]\right) \wedge m' = \mathit{ite}(m > A[i], A[i], m) \implies m' \le A[i]$$

$$\left( \forall j \,.\, i < j < N \implies m \le A[j] \right) \wedge m' = \mathit{ite}(m > A[i], A[i], m) \implies \left( \forall j \,.\, i < j < N \implies m' \le A[j] \right)$$

The former does not require any information about A[i+1], . . . , A[N−1], so the entire quantified conjunct can be ignored, and A[i] can be replaced by a fresh integer variable. The latter is trickier: it requires proving that if all elements in a range are greater than or equal to m, then they are also greater than or equal to ite(m > A[i], A[i], m). This again reduces to a quantifier-free formula over integer arithmetic:

$$m \le A[j] \wedge m' = \mathit{ite}(m > A[i], A[i], m) \implies m' \le A[j]$$

Thus, because formulas (3) and (4) are valid, the progress candidate is proved to be a progress lemma.
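The validity of (4) can also be cross-checked by exhaustive testing over a small finite domain, which stands in for the SMT proof; this is illustrative only and not part of the algorithm.

```python
from itertools import product

def formula4_holds(A, i, m):
    # Brute-force instance of formula (4) for the first loop of Fig. 2:
    # if m bounds A[i+1..N-1] from below and m' = ite(m > A[i], A[i], m),
    # then m' bounds A[i..N-1] from below.
    N = len(A)
    if not i >= 0:                                        # guard of CHC B
        return True
    if not all(m <= A[j] for j in range(i + 1, N)):       # candidate premise
        return True
    m2, i2 = (A[i] if m > A[i] else m), i - 1             # loop body + i' = i - 1
    return all(m2 <= A[j] for j in range(i2 + 1, N))      # candidate conclusion

# Enumerate all small arrays, counter values, and values of m.
ok = all(formula4_holds(list(A), i, m)
         for A in product(range(-2, 3), repeat=3)
         for i in range(-1, 3)
         for m in range(-3, 4))
print(ok)
```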

In general, we cannot always conduct proofs that easily. Often, the prerequisite for success is the commonality of an access function f between the candidate and the body of the CHC. Fortunately, our algorithm ensures that all access functions used in the candidates are borrowed directly from the bodies of CHCs. Thus, in many cases, FreqHorn is able to check large numbers of candidates quickly.

#### **4.3 Strategy of Lemma Propagation**

In this subsection, we identify a useful strategy for propagating quantified lemmas through adjacent CHCs in the given system, inspired by [17]. Let some *inv*1 ∈ *R* have the following lemma:

$$\forall \vec{q} . \rho(\vec{q}) \implies \ell$$

where q = *QVars*(*inv*1), the formula ρ over q ∪ *IntVars*(*inv*1) is either a range or a progress range, and ℓ is over q ∪ *Vars*(*inv*1). Let then a CHC C be such that *rel*(*src*(C)) = *inv*1 and *rel*(*dst*(C)) = *inv*2, and let its body be ϕ(x1, x2).

**Definition 5.** Forward propagation *of a lemma* ∀q . ρ(q) =⇒ ℓ *through* C *gives a formula of the following form:*

$$\forall \vec{q} \,.\, (\exists \vec{x}_1 \,.\, \rho(\vec{q})(\vec{x}_1) \wedge \varphi(\vec{x}_1, \vec{x}_2)) \implies (\exists \vec{x}_1 \,.\, \ell(\vec{q})(\vec{x}_1) \wedge \varphi(\vec{x}_1, \vec{x}_2))$$

*Example 4.* Recall the example from Fig. 2 and the following lemma for *inv*1:

$$\forall j \, . i < j < N \implies m \le A[j]$$

The body of **C** is i < 0 ∧ i' = 0; thus, forward propagation gives the following formula:

$$\forall j. \left(\exists i. \, i < j < N \land i < 0 \land i' = 0\right) \implies \left(\exists i. \, m \le A[j] \land i < 0 \land i' = 0\right)$$

Applying quantifier elimination to both sides of the implication, we get the following formula:

$$\forall j \; . \; 0 \le j < N \implies m \le A[j].$$

Note that this formula is not immediately learned as a lemma; instead, it is checked by the solver for inductiveness. Intuitively, such a candidate represents facts about array elements that were accessed during a loop that has terminated. If, after the propagation, the candidate uses the entire range, we refer to it as a *finalized* candidate.
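As a quick sanity check of the propagated candidate from Example 4, one can run the first loop of Fig. 1 on all small arrays and test that the finalized property holds at loop exit. Brute force stands in for the solver's check here, purely for illustration.

```python
from itertools import product

def propagated_candidate_holds(A):
    # Run the first loop of Fig. 1 to completion (i goes from N-1 down
    # past 0), then test the propagated candidate
    # forall j . 0 <= j < N  =>  m <= A[j].
    m = 0
    for i in range(len(A) - 1, -1, -1):
        if m > A[i]:
            m = A[i]
    return all(m <= A[j] for j in range(len(A)))

# Enumerate all arrays of length 0..3 over a small value domain.
ok = all(propagated_candidate_holds(list(A))
         for n in range(4)
         for A in product(range(-3, 4), repeat=n))
print(ok)
```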

#### **4.4 Weakening Strategy**

Whenever a finalized candidate cannot be proven inductive, we often do not want to withdraw it completely. Instead, our algorithm performs *weakening* and proposes *regress candidates*. The main idea is to compute the range of elements that have not yet been touched by the loop. This is the inverse of the procedure outlined in Sect. 4.1.

**Definition 6.** *Given inv* ∈ *R , its Range*(*inv*) *and progressRange*(*inv*) *formulas, we call a* regress range *a formula of the following kind:*

$$\mathit{regressRange}(\mathit{inv}) \stackrel{\mathrm{def}}{=} \mathit{Range}(\mathit{inv}) \wedge \neg \mathit{progressRange}(\mathit{inv})$$

We call candidates that use regress ranges on their left-hand sides *regress candidates*. Clearly, a regress candidate is weaker than the corresponding finalized candidate. Thus, the failure to prove inductiveness of the finalized candidate does not imply that the regress candidate is not inductive, and it makes sense to try proving it in the next iteration.
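On the concrete ranges of Example 2, Definition 6 can be exercised directly: for *inv*2 with Range −1 < v < N and progressRange −1 < v < i, the regress range picks out exactly the indices the loop has not visited yet. A small Python check (the helper names are ours):

```python
# Ranges of inv2 from Example 2, as Python predicates over an index v.
def in_range(v, N):      return -1 < v < N
def in_progress(v, i):   return -1 < v < i
def in_regress(v, i, N): return in_range(v, N) and not in_progress(v, i)

# With N = 7 elements and counter value i = 3, the progress range covers
# the already-written prefix and the regress range the untouched suffix.
N, i = 7, 3
touched   = [v for v in range(N) if in_progress(v, i)]
remaining = [v for v in range(N) if in_regress(v, i, N)]
print(touched, remaining)
```

The two lists partition the full range, which is what makes the regress candidate a sound weakening target.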

#### **4.5 Learning from Sub-ranges**

In complicated scenarios of loops with multiple iterators, multiple array variables, or multiple access functions, the iterative process of lemma discovery might end up with a large number of quantified formulas and get lost while checking a candidate for inductiveness (recall Sect. 4.2). To overcome current limitations of existing SMT solvers, it proved useful to help the solver generalize learned lemmas. In particular, a property can be learned for two sub-ranges of an array and then combined in the following way:

```
int N = nondetInt();
int *A = nondetArray(2 * N);
int val1 = 1, val2 = 3, m = nondetInt();
for (int i = 1; i <= N; i++) {
  if (m < val2) A[2*i - 2] = val2; else A[2*i - 2] = 0;
  if (m < val1) A[2*i - 1] = val1; else A[2*i - 1] = 0; }
for (int i = 0; i < 2*N; i++) assert(A[i] == 0 || A[i] <= m);
```
**Fig. 3.** Learning from sub-ranges.

**Lemma 1.** *Let, for some inv* ∈ *R, two lemmas be of the following form:*

$$\forall \vec{q} \,.\, \rho_1(\vec{q}) \implies \ell \qquad\qquad \forall \vec{q} \,.\, \rho_2(\vec{q}) \implies \ell$$

*Then, the following is also a lemma for inv:*

$$\forall \vec{q} . \rho\_1(\vec{q}) \lor \rho\_2(\vec{q}) \implies \ell$$

*Example 5.* Figure 3 shows a program from the tiling benchmark suite [9]. If the lemmas ∀j . 0 < j < N =⇒ A[2∗j−1] = 0 ∨ A[2∗j−1] ≤ m and ∀j . 0 < j < N =⇒ A[2∗j−2] = 0 ∨ A[2∗j−2] ≤ m are discovered, then the formula ∀j . 0 ≤ j < 2∗N−1 =⇒ A[j] = 0 ∨ A[j] ≤ m is also a lemma.
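Lemma 1 can be sanity-checked by brute force on a small finite domain: if a cell property holds under each of two ranges separately, it holds under their union. The ranges and the cell property below are illustrative choices, not taken from the benchmark.

```python
from itertools import product

def lemma_holds(rho, ell, states):
    # "forall q . rho(q) => ell(q)" checked over an explicit state space.
    return all(ell(q, s) for q, s in states if rho(q, s))

# Illustrative instance: the state s is a pair (A, m); the cell property
# ell says A[q] <= m; rho1 and rho2 cover the even and odd indices.
rho1 = lambda q, s: 0 <= q < len(s[0]) and q % 2 == 0
rho2 = lambda q, s: 0 <= q < len(s[0]) and q % 2 == 1
ell  = lambda q, s: s[0][q] <= s[1]

# All arrays of length 0..3 over {0,1,2}, paired with m = max(A),
# and index values q including out-of-range ones.
states = [(q, (A, max(A, default=0)))
          for A in (list(t) for n in range(4) for t in product(range(3), repeat=n))
          for q in range(-1, 4)]

assert lemma_holds(rho1, ell, states) and lemma_holds(rho2, ell, states)
union = lambda q, s: rho1(q, s) or rho2(q, s)
print(lemma_holds(union, ell, states))
```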

#### **5 Evaluation**

We have implemented our algorithm on top of the FreqHorn<sup>3</sup> tool. It takes a system of CHCs with arrays as input and performs an enumerative search as presented in Sect. 4. The tool uses Z3 [12] to solve SMT queries.

We have evaluated FreqHorn on 137 satisfiable CHC-translations of publicly available C programs (whose assertions are safe) taken from the SVCOMP ReachSafety Array subcategory and literature. These programs include variations of standard array copying, initializing, maximum, minimum, sorting, and tiling benchmarks. Among these 137 benchmarks, 79 have a single loop, and 58 have multiple loops, including 7 that have nested loops. These programs are encoded using the theories of Arrays, Linear (LIA) and Non-linear Integer Arithmetic (NIA). Our experiments have been performed on an Ubuntu 18.04 machine running at 2.5 GHz and having 16 GB memory, with a timeout of 100 s for every benchmark. FreqHorn solved 129 benchmarks within the timeout, of which 73 solved benchmarks had a single loop and 56 had multiple loops.

We have compared our tool with Spacer (Z3 v4.8.3) [26], which implements the recent QUIC3 [22] algorithm, Booster (v0.2) [2], VIAP (v1.0) [35], and VeriAbs (v1.3.10) [11]. The last two tools performed well in the ReachSafety Array

<sup>3</sup> The source code and benchmarks are available at https://github.com/grigoryfedyuk ovich/aeval/tree/rnd.

**Fig. 4.** FreqHorn vs competitors. Each point in a plot represents a pair of the run times (sec <sup>×</sup> sec) of FreqHorn (x-axis) and a competitor (y-axis). Timeouts are placed on the inner dashed lines; false alarms, unsupported cases, and crashes are on the outer dashed lines.

subcategory at SVCOMP 2019<sup>4</sup>. Figure 4 gives a comparison of FreqHorn timings against timings of these tools.<sup>5</sup>

Compared to the 129 benchmarks solved by FreqHorn, Spacer solved only 81, VeriAbs 108, VIAP 70, and Booster 48.

FreqHorn solved 54 benchmarks on which Spacer diverged. Our intuition is that Spacer works poorly on programs with non-deterministic assignments and NIA operations, which our tool can handle.

FreqHorn solved 27 benchmarks on which VeriAbs diverged. VeriAbs failed to solve programs with nested loops and programs in which array values depend on access indices. Furthermore, it decided one of the programs as unsafe. Time-wise, FreqHorn significantly outperformed VeriAbs on all benchmarks.

<sup>4</sup> https://sv-comp.sosy-lab.org/2019/results/results-verified/.

<sup>5</sup> The time taken for every benchmark is available at: http://bit.ly/2VS5Mtf.

Importantly, the short time taken by FreqHorn includes the time for generating a checkable witness (a quantified invariant), which VeriAbs by design cannot produce. On the other hand, VeriAbs solved several benchmarks after merging loops; no quantified invariant satisfying FreqHorn's restrictions exists for these benchmarks before this program transformation.

FreqHorn solved 60 programs on which VIAP diverged; VIAP also decided one program as unsafe. There were no programs on which FreqHorn took more time than VIAP. Finally, FreqHorn solved 83 programs on which Booster diverged, and Booster decided two programs as unsafe.

#### **6 Related Work**

Our algorithm for quantified invariant synthesis extends the prior work on checking satisfiability of CHCs [15–17], where solutions do not permit quantifiers. It works in a similar enumerate-and-check manner, but with two crucial changes: (1) the introduction of quantifiers, to formulate hypotheses over a subset of array indices, and (2) a generalization mechanism, to derive properties that may hold over the entire range of array indices.

Many existing approaches for verifying programs over arrays extend well-known techniques for programs over scalar variables to quantified invariants: for example, by extending predicates with Skolem variables in predicate abstraction [30], by exploiting the MCMT framework [19] in lazy abstraction with interpolants [1] and its integration with acceleration [2], and, recently, in QUIC3 [22], which extends IC3 [8,14] to universally quantified invariants. Apart from the skeletal similarity, however, these approaches rely on orthogonal techniques.

Partitioning of arrays has also been used to infer invariants in many different ways. It refers to splitting an array into symbolic segments, and may be based on syntax [20,23,25] or semantics [10,31]. Invariants may be inferred for each segment separately and generalized for the entire array. The partitioning need not be explicit, as in [13]. However, most of these techniques (except [13,31]) are restricted to contiguous array segments, and work well when different loop iterations write to disjoint array locations or when the segments are non-overlapping. Tiling [9], a property-driven verification technique, overcomes these limitations for a class of programs by inferring array access patterns in loops. But identifying tiles of array accesses is itself a difficult problem, and the approach is currently based on heuristics developed by observing interesting patterns.

There are a number of approaches that verify array programs without inferring quantified invariants explicitly. A straightforward way is to smash all array elements into a single memory location [4], but it is quite imprecise. Every array element might also be treated as a separate variable, but this is not possible when array sizes are unknown. There are also techniques that abstract an array to a fixed number of elements, e.g., k-distinguished cell abstraction [32,33] and k-shrinkability [24,29]. Such abstractions usually reduce array-modifying loops with unknown bounds to a known, small bound. It may even be possible to get rid of such loops altogether, by accelerating (computing transitive closures of) transition relations involving array updates in that loop [7]. Along similar lines, VIAP [35] resorts to reasoning with recurrences instead of loops. It translates the input program, including loops, to a set of first-order axioms, and checks if they derive the property. Unlike ours, however, none of these techniques obtains quantified invariants explicitly. Besides, many of these transformations produce an abstraction of the original program, i.e., they do not preserve safety.

Alternatively, there are approaches that use sufficiently expressive templates to infer quantified invariants over arrays [5,21,27]. However, the templates need to be supplied manually. For instance, [6] uses a template space of quantified invariants and reduces the problem to quantifier-free invariant generation. Thus, universally quantified solutions for unknown predicates in a CHC system may be obtained by extending a generic CHC solver to handle quantified predicates. Learning need not be limited to user-supplied templates; one may do away with the templates entirely and learn only from examples and counterexamples [18]. Alternatively, [36] chooses a template upfront and refurbishes it with constants or coefficients appearing in the program source. Similarly, [28] proposes to infer array invariants without any user guidance or any user-defined templates or predicates. Their method is based on automatic analysis of predicates that update an array and allows one to generate first-order invariants, including those that contain alternations of quantifiers. But it does not work for nested loops. By comparison, our technique supports multiple as well as nested loops, enables candidate propagation between loops and, more importantly, generates the grammar automatically from the syntactical constructions appearing in the program's source.

#### **7 Conclusion**

We have presented a new algorithm to synthesize quantified invariants over array variables that are systematically accessed in loops. Our algorithm implements an enumerative search that guesses invariants based on syntactic constructions appearing in the code and checks their initiation, inductiveness, and safety with an off-the-shelf SMT solver. The key insights behind our approach are that individual accesses to array elements performed in the loop can be generalized to hypotheses about entire ranges, and that existing SMT solvers can validate these hypotheses efficiently. Our implementation on top of the CHC solver FreqHorn confirmed that this strategy is effective on a variety of practical examples. In the vast majority of cases, our tool outperformed competitors and provided checkable guarantees that prevented reporting false positives.

**Acknowledgements.** This work was supported in part by NSF Grant 1525936. Any opinions, findings, and conclusions expressed herein are those of the authors and do not necessarily reflect those of the NSF.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Efficient Synthesis with Probabilistic Constraints**

Samuel Drews, Aws Albarghouthi, and Loris D'Antoni

> University of Wisconsin-Madison, Madison, USA sedrews@wisc.edu

**Abstract.** We consider the problem of synthesizing a program given a probabilistic specification of its desired behavior. Specifically, we study the recent paradigm of *distribution-guided inductive synthesis* (digits), which iteratively calls a synthesizer on finite sample sets from a given distribution. We make theoretical and algorithmic contributions: (*i*) We prove the surprising result that digits only requires a polynomial number of synthesizer calls in the size of the sample set, despite its ostensibly exponential behavior. (*ii*) We present a property-directed version of digits that further reduces the number of synthesizer calls, drastically improving synthesis performance on a range of benchmarks.

#### **1 Introduction**

Over the past few years, progress in automatic program synthesis has touched many application domains, including automating data wrangling and data extraction tasks [2,13,15,21,22,30], generating network configurations that meet user intents [10,29], optimizing low-level code [25,28], and more [4,14].

The majority of current work has focused on synthesis under Boolean constraints. However, we often require the program to adhere to a probabilistic specification, e.g., a controller that succeeds with high probability, a decision-making model operating over a probabilistic population model, a randomized algorithm ensuring privacy, etc. In this work, we are interested in (1) investigating probabilistic synthesis from a theoretical perspective and (2) developing efficient algorithmic techniques to tackle this problem.

Our starting point is our recent framework for probabilistic synthesis called *distribution-guided inductive synthesis* (digits) [1]. The digits framework is analogous in nature to the *guess-and-check* loop popularized by counterexample-guided approaches to synthesis and verification (cegis and cegar). The key idea of the algorithm is to reduce the probabilistic synthesis problem to a non-probabilistic one that can be solved using existing techniques, e.g., sat solvers. This is performed using the following loop: (1) approximating the input probability distribution with a finite sample set; (2) synthesizing a program for various possible output assignments of the finite sample set; and (3) invoking a probabilistic verifier to check if one of the synthesized programs indeed adheres to the given specification.

digits has been shown to theoretically converge to correct programs when they exist—thanks to learning-theory guarantees. The primary bottleneck of digits is the number of expensive calls to the synthesizer, which is ostensibly exponential in the size of the sample set. Motivated by this observation, this paper makes theoretical, algorithmic, and practical contributions.


#### **2 An Overview of DIGITS**

In this section, we present the synthesis problem, the digits [1] algorithm, and fundamental background on learning theory.

#### **2.1 Probabilistic Synthesis Problem**

**Program Model.** As discussed in [1], digits searches through some (infinite) set of programs, but it requires that the set of programs has *finite VC dimension* (we restate this condition in Sect. 2.3). Here we describe one constructive way of obtaining such sets of programs with finite VC dimension: we will consider sets of programs defined as *program sketches* [27] in the simple grammar from [1], where a program is written in a loop-free language, and "holes" defining the sketch replace some constant terminals in expressions.<sup>1</sup> The syntax of the language is defined below:

$$P \coloneqq V \leftarrow E \;\mid\; \texttt{if}\ B\ \texttt{then}\ P\ \texttt{else}\ P \;\mid\; P;\, P \;\mid\; \texttt{return}\ V$$

Here, P is a program, V is the set of variables appearing in P, E (resp. B) is the set of linear arithmetic (resp. Boolean) expressions over V (where, again, constants in E and B can be replaced with holes), and V ← E is an assignment. We assume a vector *v*<sup>I</sup> of variables in V that are inputs to the program. We

<sup>1</sup> In the case of loop-free program sketches as considered in our program model, we can convert the input-output relation into a real arithmetic formula that is guaranteed to have finite VC dimension [12].

also assume there is a single Boolean variable v<sup>r</sup> ∈ V that is returned by the program.<sup>2</sup> All variables are real-valued or Boolean. Given a vector of constant values *c*, where |*c*| = |*v*<sup>I</sup> |, we use P(*c*) to denote the result of executing P on the input *c*.

In our setting, the inputs to a program are distributed according to some *joint probability distribution* D over the variables *v*<sub>I</sub>. Semantically, a program P is denoted by a *distribution transformer* ⟦P⟧, whose input is a distribution over values of *v*<sub>I</sub> and whose output is a distribution over *v*<sub>I</sub> and v<sub>r</sub>.

A program also has a *probabilistic postcondition*, *post*, defined as an inequality over terms of the form Pr[B], where B is a Boolean expression over *v*<sub>I</sub> and v<sub>r</sub>. Specifically, a probabilistic postcondition consists of Boolean combinations of inequalities of the form e > c, where c ∈ ℝ and e is an arithmetic expression over terms of the form Pr[B], e.g., Pr[B<sub>1</sub>]/Pr[B<sub>2</sub>] > 0.75.

Given a triple (P, D, *post*), we say that P is *correct* with respect to D and *post*, denoted ⟦P⟧(D) ⊨ *post*, iff *post* is true on the distribution ⟦P⟧(D).

*Example 1.* Consider the set of intervals of the form [0, a] ⊆ [0, 1] and inputs x uniformly distributed over [0, 1] (i.e. D = Uniform[0, 1]). We can write inclusion in the interval as a (C-style) program (left) and consider a postcondition stating that the interval must include at least half the input probability mass (right):
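The program listing and postcondition referenced as "(left)" and "(right)" did not survive extraction; the paper shows a C-style membership test and a postcondition requiring Pr[v<sub>r</sub> = 1] ≥ 0.5. The following Python sketch reconstructs their content from the surrounding description (function names are ours):

```python
import random

def interval_program(x, a):
    """Membership test for the interval [0, a]: return 1 iff x lies in [0, a]."""
    return 1 if 0.0 <= x <= a else 0

def post_holds(a, n=100_000, seed=0):
    """Postcondition: Pr[v_r = 1] >= 0.5 under x ~ Uniform[0, 1].
    Estimated here by sampling; exactly, the captured mass equals a,
    so the postcondition holds iff a >= 0.5."""
    rng = random.Random(seed)
    mass = sum(interval_program(rng.random(), a) for _ in range(n)) / n
    return mass >= 0.5
```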

Let P<sub>c</sub> denote the interval program where a is replaced by a constant c ∈ [0, 1]. Observe that ⟦P<sub>c</sub>⟧(D) describes a joint distribution over (x, v<sub>r</sub>) pairs, where [0, c]×{1} is assigned probability measure c and (c, 1]×{0} is assigned probability measure 1 − c. Therefore, ⟦P<sub>c</sub>⟧(D) ⊨ *post* if and only if c ∈ [0.5, 1].

**Synthesis Problem.** digits outputs a program that is approximately "similar" to a given functional specification and that meets a postcondition. The functional specification is an input-output relation that we quantitatively want to match as closely as possible: specifically, we want to minimize the *error* of the output program P from the functional specification P̂, defined as Er(P) := Pr<sub>x∼D</sub>[P(x) ≠ P̂(x)]. (Note that we represent the functional specification as a program.) The postcondition is Boolean, and therefore we always want it to be true. digits is guaranteed to converge whenever the space of solutions satisfying the postcondition is *robust* under small perturbations. The following definition captures this notion of robustness:

**Definition 1 (**α**-Robust Programs).** *Fix an input distribution* D*, a postcondition post, and a set of programs* P*. For any* P ∈ P *and any* α > 0*, denote the*

<sup>2</sup> Restricting the output to Boolean is required by the algorithm; other output types can be turned into Boolean by rewriting. See, e.g., thermostat example in Sect. 5.

open α-ball centered at P *as* B<sub>α</sub>(P) = {P′ ∈ P | Pr<sub>x∼D</sub>[P(x) ≠ P′(x)] < α}*. We say a program* P *is* α-robust *if* ∀P′ ∈ B<sub>α</sub>(P). ⟦P′⟧(D) ⊨ *post.*

We can now state the synthesis problem solved by digits:

**Definition 2 (Synthesis Problem).** *Given an input distribution* D*, a set of programs* P*, a postcondition post, a functional specification* P̂ ∈ P*, and parameters* α > 0 *and* 0 < ε ≤ α*, the synthesis problem is to find a program* P ∈ P *such that* ⟦P⟧(D) ⊨ *post and such that for any other* α*-robust* P′ *we have Er*(P) ≤ *Er*(P′) + ε*.*

#### **2.2 A Naive DIGITS Algorithm**

Algorithm <sup>1</sup> shows a simplified, naive version of digits, which employs a *synthesize-then-verify* approach. The idea of digits is to utilize non-probabilistic synthesis techniques to synthesize a set of programs, and then apply a probabilistic verification step to check if any of the synthesized programs is a solution.

Specifically, this "Naive digits" begins by sampling an appropriate number of inputs from the input distribution and storing them in the set S. Second, it iteratively explores each possible function f that maps the input samples to a Boolean and invokes a synthesis oracle to synthesize a program P that implements f, i.e., one that satisfies the set of input–output examples in which each input x ∈ S is mapped to the output f(x). Naive digits then finds which of the

```
1 Procedure digits(P̂, D, post, m)
2   S ← {xᵢ ∼ D | i ∈ [1, ..., m]}
3   progs ← ∅
4   foreach f : S → {0, 1} do
5     P ← Osyn({(x, f(x)) | x ∈ S})
6     if P ≠ ⊥ then
7       progs ← progs ∪ {P}
8   res ← {P ∈ progs | Over(P, D, post)}
9   return argmin_{P ∈ res} Oerr(P)
```
### **Algorithm 1:** Naive digits

synthesized programs satisfy the postcondition (the set *res*); we assume that we have access to a probabilistic verifier Over to perform these computations. Finally, the algorithm outputs the program in the set *res* that has the lowest error with respect to the functional specification, once again assuming access to another oracle Oerr that can measure the error.

Note that the number of such functions f : S → {0, 1} is exponential in |S|. As a "heuristic" to improve performance, the actual digits algorithm as presented in [1] employs an incremental trie-based search, which we describe (alongside our new algorithm, τ-digits) and analyze in Sect. 3. The naive version described here is, however, sufficient to discuss the convergence properties of the full algorithm.
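A minimal executable sketch of Naive digits for the interval programs of Example 1 is shown below. The oracles Osyn, Over, and Oerr are instantiated by hand for this class; all implementations here are illustrative assumptions, not the paper's code:

```python
import itertools
import random

def o_syn(examples):
    """Synthesis oracle for [0, a] intervals: return an `a` consistent with the
    labeled (x, y) examples, or None if no such interval exists."""
    ones = [x for x, y in examples if y == 1]
    zeros = [x for x, y in examples if y == 0]
    a = max(ones, default=0.0)           # a must cover every positive example
    if any(x <= a for x in zeros):       # a negative example inside [0, a] is fatal
        return None
    return a

def o_ver(a, threshold=0.5):
    """Probabilistic verifier: for D = Uniform[0, 1], Pr[x <= a] = a exactly."""
    return a >= threshold

def o_err(a, a_spec):
    """Error w.r.t. the specification interval [0, a_spec]: |a - a_spec|."""
    return abs(a - a_spec)

def naive_digits(a_spec, m, seed=0):
    """Naive digits: enumerate all 2^m labelings of m samples, synthesize a
    program per realizable labeling, verify, and return the lowest-error one."""
    rng = random.Random(seed)
    samples = [rng.random() for _ in range(m)]
    progs = []
    for bits in itertools.product([0, 1], repeat=m):
        a = o_syn(list(zip(samples, bits)))
        if a is not None:
            progs.append(a)
    res = [a for a in progs if o_ver(a)]
    return min(res, key=lambda a: o_err(a, a_spec)) if res else None
```

For instance, `naive_digits(0.3, 6)` returns the smallest synthesized `a` that still satisfies the postcondition, illustrating the tension between error minimization and correctness.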

#### **2.3 Convergence Guarantees**

digits is only guaranteed to converge when the program model <sup>P</sup> has *finite VC dimension*. <sup>3</sup> Intuitively, the VC dimension captures the expressiveness of the set

<sup>3</sup> Recall that this is largely a "free" assumption since, again, sketches in our loop-free grammar are guaranteed to have finite VC dimension.

of ({0, 1}-valued) programs P. Given a set of inputs S, we say that P *shatters* S iff, for every partition of S into disjoint sets S<sub>0</sub> and S<sub>1</sub>, there exists a program P ∈ P such that (*i*) for every x ∈ S<sub>0</sub>, P(x) = 0, and (*ii*) for every x ∈ S<sub>1</sub>, P(x) = 1.

**Definition 3 (VC Dimension).** *The* VC dimension *of a set of programs* P *is the largest integer* d *such that there exists a set of inputs* S *with cardinality* d *that is shattered by* P*.*

We define the function $\mathrm{VCcost}(\varepsilon, \delta, d) = \frac{1}{\varepsilon}\left(4 \log_2 \frac{2}{\delta} + 8d \log_2 \frac{13}{\varepsilon}\right)$ [5], which is used in the following theorem:

**Theorem 1 (Convergence).** *Assume that there exist an* α > 0 *and a program* P∗ *that is* α*-robust w.r.t.* D *and post. Let* d *be the VC dimension of the set of programs* P*. For all bounds* 0 < ε ≤ α *and* δ > 0*, for every function* Osyn*, and for any* m ≥ VCcost(ε, δ, d)*, with probability* 1 − δ *we have that* digits *enumerates a program* P *with* Pr<sub>x∼D</sub>[P∗(x) ≠ P(x)] ≤ ε *and* ⟦P⟧(D) ⊨ *post.*

To reiterate, suppose P∗ is a correct program with small error Er(P∗) = k; the convergence result follows from two main points: (*i*) P∗ must be α*-robust*, meaning every P with Pr<sub>x∼D</sub>[P(x) ≠ P∗(x)] < α must also be correct, and therefore (*ii*) by synthesizing *any* P such that Pr<sub>x∼D</sub>[P(x) ≠ P∗(x)] ≤ ε where ε < α, we obtain a correct program whose error Er(P) is within k ± ε.
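For a concrete sense of the sample-complexity bound, VCcost can be evaluated numerically (a small sketch; the parameter values below are illustrative, not taken from the paper):

```python
import math

def vc_cost(eps, delta, d):
    """VCcost(eps, delta, d) = (1/eps) * (4*log2(2/delta) + 8*d*log2(13/eps))."""
    return (1.0 / eps) * (4 * math.log2(2 / delta) + 8 * d * math.log2(13 / eps))

# For the interval class (VC dimension d = 1), eps = 0.1, delta = 0.05:
# a few hundred samples already suffice to get the PAC-style guarantee.
m_needed = math.ceil(vc_cost(0.1, 0.05, 1))
```

As expected, the bound grows as ε shrinks and linearly in the VC dimension d.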

#### **2.4 Understanding Convergence**

The importance of finite VC dimension is due to the fact that the convergence statement borrows directly from *probably approximately correct (PAC) learning*. We will briefly discuss a core detail of efficient PAC learning that is relevant to understanding the convergence of digits (and, in turn, our analysis of <sup>τ</sup> -digits in Sect. 4), and refer the interested reader to Kearns and Vazirani's book [16] for a complete overview. Specifically, we consider the notion of an ε*-net*, which establishes the approximate-definability of a target program in terms of points in its input space.

**Definition 4 (**ε**-net).** *Suppose* P ∈ P *is a target program, and points in its input domain* X *are distributed as* x ∼ D*. For a fixed* ε ∈ [0, 1]*, we say a set of points* S ⊂ X *is an* ε-net *for* P *(with respect to* P *and* D*) if for every* P′ ∈ P *with* Pr<sub>x∼D</sub>[P(x) ≠ P′(x)] > ε *there exists a witness* x ∈ S *such that* P(x) ≠ P′(x)*.*

In other words, if S is an ε-net for P, and if P′ "agrees" with P on all of S, then P and P′ can differ by at most ε probability mass.

Observe the relevance of ε-nets to the convergence of digits: the synthesis oracle is guaranteed not to "fail" by producing only programs ε-far from some ε-robust P∗ whenever the sample set happens to be an ε-net for P∗. In fact, this observation is exactly the core of the PAC learning argument: having an ε-net guarantees approximate learnability.

A remarkable result of computational learning theory is that whenever P has finite VC dimension, the probability that m random samples fail to yield an ε-net becomes diminishingly small as <sup>m</sup> increases. Indeed, the given VCcost function used in Theorem 1 is a dual form of this latter result—that polynomially many samples are sufficient to form an ε-net with high probability.

#### **3 The Efficiency of Trie-Based Search**

After providing details on the search strategy employed by digits, we present our theoretical result on the polynomial bound on the number of synthesis queries that digits requires.

#### **3.1 The Trie-Based Search Strategy of DIGITS**

Naive digits, as presented in Algorithm 1, performs a very unstructured, exponential search over the output labelings of the sampled inputs—i.e., the possible Boolean functions f in Algorithm 1. In our original paper [1] we present a "heuristic" implementation strategy that incrementally explores the set of possible output labelings using a trie data structure. In this section, we study the complexity of this technique through the lens of computational learning theory and discover the surprising result that digits requires a polynomial number of calls to the synthesizer in the size of the sample set! Our improved search algorithm (Sect. 4) inherits these results.

For the remainder of this paper, we use digits to refer to this incremental version. A full description is necessary for our analysis: Fig. 1 (non-framed rules only) consists of a collection of guarded rules describing the construction of the trie used by digits to incrementally explore the set of possible output labelings. Our improved version, <sup>τ</sup> -digits (presented in Sect. 4), corresponds to the addition of the framed parts, but without them, the rules describe digits.

Nodes in the trie represent partial output labelings—i.e., functions f assigning Boolean values to only some of the samples in S = {x1,...,x<sup>m</sup>}. Each node is identified by a binary string σ = b<sup>1</sup> ··· b<sup>k</sup> (k can be smaller than m) denoting the path to the node from the root. The string σ also describes the partial output-labeling function f corresponding to the node—i.e., if the i-th bit b<sup>i</sup> is set to 1, then f(xi) = true. The set *explored* represents the nodes in the trie built thus far; for each new node, the algorithm synthesizes a program consistent with the corresponding partial output function ("Explore" rules). The variable *depth* controls the incremental aspect of the search and represents the maximum length of any σ in *explored*; it is incremented whenever all nodes up to that depth have been explored (the "Deepen" rule). The crucial part of the algorithm is that, if no program can be synthesized for the partial output function of a node identified by σ, the algorithm does not need to issue further synthesis queries for the descendants of σ.

Figure <sup>2</sup> shows how digits builds a trie for an example run on the interval programs from Example 1, where we suppose we begin with an incorrect program describing the interval [0, 0.3]. Initially, we set the root program to [0, 0.3] (left


**Fig. 1.** Full digits description and our new extension, <sup>τ</sup> -digits, shown in boxes.

figure). The "Deepen" rule applies, so a sample is added to the set of samples; suppose it is 0.4. "Explore" rules are then applied twice to build the children of the root: the child following the 0 branch needs to map 0.4 → 0, which [0, 0.3] already does, so the program is propagated to that child without asking Osyn to perform a synthesis query. For the child following 1, we instead make a synthesis query, using the oracle Osyn, for any value of a such that [0, a] maps 0.4 → 1; suppose it returns the solution a = 1, and we associate [0, 1] with this node. At this point we have exhausted depth 1 (middle figure), so "Deepen" once again applies, perhaps adding 0.6 to the sample set. At this depth (right figure), only two calls to Osyn are made: for the call at σ = 01, there is no value of a that causes both 0.4 → 0 and 0.6 → 1, so Osyn returns ⊥, and we do not explore any children of this node in the future. The algorithm continues in this manner until a stopping condition is reached, e.g., enough samples have been enumerated.
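The trie construction just described can be sketched compactly for the interval programs of the running example. Here `o_syn` is a hand-written stand-in for the oracle Osyn, and all names are ours:

```python
def o_syn(examples):
    """Interval synthesizer for [0, a]: an `a` consistent with the labeled
    (x, y) examples, or None if no such interval exists."""
    ones = [x for x, y in examples if y == 1]
    zeros = [x for x, y in examples if y == 0]
    a = max(ones, default=0.0)
    return None if any(x <= a for x in zeros) else a

def incremental_digits(a0, samples):
    """Breadth-first trie exploration: each node is a bit-string sigma labeling
    the first len(sigma) samples. One child per node reuses the parent's
    program (solution propagation); the other costs one synthesis query."""
    explored = {"": a0}        # root holds the initial (possibly incorrect) program
    queries = 0
    for depth in range(1, len(samples) + 1):   # "Deepen": reveal the next sample
        x = samples[depth - 1]
        nxt = {}
        for sigma, a in explored.items():
            own = 1 if x <= a else 0           # the parent program's output on x
            nxt[sigma + str(own)] = a          # propagate without a query
            flipped = 1 - own
            examples = [(samples[i], int(b)) for i, b in enumerate(sigma)]
            queries += 1                       # "Explore": one oracle query
            cand = o_syn(examples + [(x, flipped)])
            if cand is not None:
                nxt[sigma + str(flipped)] = cand
        explored = nxt
    return explored, queries
```

On S = {0.4, 0.6} starting from [0, 0.3] this reproduces the shape of the run above: three synthesis queries, with the node σ = 01 pruned because no interval maps 0.4 → 0 and 0.6 → 1.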

#### **3.2 Polynomial Bound on the Number of Synthesis Queries**

We observed in [1] that the trie-based exploration seems to be efficient in practice, despite potential exponential growth of the number of explored nodes in the trie as the depth of the search increases. The convergence analysis of digits relies on the finite VC dimension of the program model, but VC dimension itself is just a summary of the *growth function*, a function that describes a notion

**Fig. 2.** Example execution of incremental digits on interval programs, starting from [0, <sup>0</sup>.3]. Hollow circles denote calls to <sup>O</sup>syn that yield new programs; the cross denotes a call to Osyn that returns ⊥.

of complexity of the set of programs in question. We will see that the growth function much more precisely describes the behavior of the trie-based search; we will then use a classic result from computational learning theory to derive better bounds on the performance of the search. We define the growth function below, adapting the presentation from [16].

**Definition 5 (Realizable Dichotomies).** *We are given a set* P *of programs representing functions from* X→{0, 1} *and a (finite) set of inputs* S ⊂ X *. We call any* f : S → {0, 1} *a* dichotomy *of* S*; if there exists a program* P ∈ P *that extends* f *to its full domain* X *, we call* f *a* realizable dichotomy *in* P*. We denote the set of realizable dichotomies as*

$$\Pi\_{\mathcal{P}}(S) := \{ f : S \to \{0, 1\} \mid \exists P \in \mathcal{P}. \forall x \in S. P(x) = f(x) \}.$$

Observe that for any (infinite) set P and any finite set S, we have 1 ≤ |Π<sub>P</sub>(S)| ≤ 2<sup>|S|</sup>. We define the growth function in terms of the realizable dichotomies:

**Definition 6 (Growth Function).** *The* growth function *is the maximal number of realizable dichotomies as a function of the number of samples, denoted*

$$\hat{\Pi}\_{\mathcal{P}}(m) := \max\_{\substack{S \subset \mathcal{X} : \\ |S| = m}} \{ |\Pi\_{\mathcal{P}}(S)| \}.$$

Observe that P has VC dimension d if and only if d is the largest integer satisfying Π̂<sub>P</sub>(d) = 2<sup>d</sup> (and P has infinite VC dimension when Π̂<sub>P</sub>(m) is identically 2<sup>m</sup>); in fact, VC dimension is often defined using this characterization.

*Example 2.* Consider the set of intervals of the form [0, a] as in Example 1 and Fig. 2. For the set of two points S = {0.4, 0.6}, we have that |Π<sub>[0,a]</sub>(S)| = 3, since, by example: a = 0.5 accepts 0.4 but not 0.6, a = 0.3 accepts neither, and a = 1 accepts both, thus these three dichotomies are realizable; however, no interval with 0 as a left endpoint can accept 0.6 and not 0.4, thus this dichotomy is not realizable. In fact, for any (finite) set S ⊂ [0, 1], we have that |Π<sub>[0,a]</sub>(S)| = |S| + 1; we then have that Π̂<sub>[0,a]</sub>(m) = m + 1.
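The dichotomy counts in Example 2 can be checked by brute-force enumeration (a small sketch; for this class, candidate endpoints at the sample points themselves plus 0 and 1 suffice):

```python
def realizable_dichotomies(points):
    """Enumerate the dichotomies of `points` realizable by programs [0, a]
    with a in [0, 1]. Only endpoints at the points themselves (plus 0 and 1)
    produce distinct labelings, so trying those candidates is exhaustive."""
    candidates = sorted(set(points) | {0.0, 1.0})
    seen = set()
    for a in candidates:
        seen.add(tuple(1 if x <= a else 0 for x in points))
    return seen

dichs = realizable_dichotomies([0.4, 0.6])
# |Pi(S)| = |S| + 1 for this class; the labeling (0, 1), i.e. accept 0.6
# but reject 0.4, is the one dichotomy that is not realizable.
```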

When digits terminates having used a sample set S, it has considered all the dichotomies of S: the programs it has enumerated exactly correspond to extensions of the realizable dichotomies Π<sub>P</sub>(S). The trie-based exploration effectively tries to minimize the number of Osyn queries performed on non-realizable ones, but does so without explicit knowledge of the full functional behavior of programs in P. In fact, it manages to stay relatively close to performing queries only on the realizable dichotomies:

**Lemma 1.** digits *performs at most* |S| · |Π<sub>P</sub>(S)| *synthesis oracle queries. More precisely, let* S = {x<sub>1</sub>,...,x<sub>m</sub>} *be indexed by the depth at which each sample was added: the exact number of synthesis queries is* $\sum_{\ell=1}^{m} |\Pi_{\mathcal{P}}(\{x_1,\ldots,x_{\ell-1}\})|$*.*

*Proof.* Let T<sub>d</sub> denote the total number of queries performed once depth d is completed. We perform no queries for the root,<sup>4</sup> thus T<sub>0</sub> = 0. Upon completing depth d − 1, the realizable dichotomies of {x<sub>1</sub>,...,x<sub>d−1</sub>} exactly specify the nodes whose children will be explored at depth d. For each such node, one child is skipped due to solution propagation, while an oracle query is performed on the other, thus T<sub>d</sub> = T<sub>d−1</sub> + |Π<sub>P</sub>({x<sub>1</sub>,...,x<sub>d−1</sub>})|. Lastly, |Π<sub>P</sub>(S)| cannot decrease by adding elements to S, so we have that $T_m = \sum_{\ell=1}^{m} |\Pi_{\mathcal{P}}(\{x_1,\ldots,x_{\ell-1}\})| \le \sum_{\ell=1}^{m} |\Pi_{\mathcal{P}}(S)| \le |S| \cdot |\Pi_{\mathcal{P}}(S)|$.

Connecting digits to the realizable dichotomies and, in turn, the growth function allows us to employ a remarkable result from computational learning theory, stating that the growth function for any set exhibits one of two asymptotic behaviors: it is either *identically* 2<sup>m</sup> (infinite VC dimension) or dominated by a polynomial! This is commonly called the Sauer-Shelah Lemma [24,26]:

**Lemma 2 (Sauer-Shelah).** *If* P *has finite VC dimension* d*, then for all* m ≥ d*,* $\hat{\Pi}_{\mathcal{P}}(m) \le \left(\frac{em}{d}\right)^{d}$*; i.e.,* Π̂<sub>P</sub>(m) = O(m<sup>d</sup>)*.*

Combining our lemma with this famous one yields a surprising result: for a fixed set of programs P with finite VC dimension, the number of oracle queries performed by digits *is guaranteed to be polynomial* in the depth of the search, where the degree of the polynomial is determined by the VC dimension:

**Theorem 2.** *If* P *has VC dimension* d*, then* digits *performs* O(m<sup>d+1</sup>) *synthesis-oracle queries.*

In short, the reason an execution of digits *seems* to enumerate a subexponential number of programs (as a function of the depth of the search) is because it literally must be polynomial. Furthermore, the algorithm performs oracle queries on *nearly* only those polynomially-many realizable dichotomies.

*Example 3.* A digits run on the [0, a] programs as in Fig. 2 using a sample set of size m will perform O(m<sup>2</sup>) oracle queries, since the VC dimension of these intervals is 1. (In fact, every run of the algorithm on these programs will perform exactly $\frac{1}{2}m(m+1)$ queries.)

<sup>4</sup> We assume the functional specification itself is some P̂ ∈ P and thus can be used; the alternative is a trivial synthesis query on an empty set of constraints.

### **4 Property-Directed** *τ* **-DIGITS**

digits has better convergence guarantees when it operates on larger sets of sampled inputs. In this section, we describe a new optimization of digits that reduces the number of synthesis queries performed by the algorithm so that it more quickly reaches higher depths in the trie, and thus scales to larger sample sets. This optimized digits, called τ-digits, is shown in Fig. 1 as the set of all the rules of digits plus the framed elements. The high-level idea is to skip synthesis queries that are (quantifiably) unlikely to result in optimal solutions. For example, if the functional specification P̂ maps every sampled input in S to 0, then the synthesis query mapping every element of S to 1 becomes increasingly likely to result in programs that have maximal distance from P̂ as the size of S increases; hence the algorithm can probably avoid performing that query. In the following, we make use of the concept of *Hamming distance* between pairs of programs:

**Definition 7 (Hamming Distance).** *For any finite set of inputs* S *and any two programs* P<sub>1</sub>, P<sub>2</sub>*, we denote* Hamming<sub>S</sub>(P<sub>1</sub>, P<sub>2</sub>) := |{x ∈ S | P<sub>1</sub>(x) ≠ P<sub>2</sub>(x)}| *(we also allow any* {0, 1}*-valued string to be an argument of* Hamming<sub>S</sub>*).*

#### **4.1 Algorithm Description**

Fix the given functional specification P̂ and suppose that there exists an ε-robust solution P∗ with (nearly) minimal error k = Er(P∗) := Pr<sub>x∼D</sub>[P̂(x) ≠ P∗(x)]; we would be happy to find *any* program P in P∗'s ε-ball. Suppose we angelically know k a priori; we can then restrict our search (for each depth m) to constraint strings (i.e., σ in Fig. 1) whose Hamming distance from P̂ is not much larger than km.

To be specific, we first fix some threshold τ ∈ (k, 1]. Intuitively, the optimization corresponds to modifying digits to consider only paths σ through the trie such that Hamming<sub>S</sub>(P̂, σ) ≤ τ|S|. This is performed using the *unblocked* function in Fig. 1. Since we are ignoring certain paths through the trie, we need to ask: *how much does this decrease the probability of the algorithm succeeding?* The answer depends on the tightness of the threshold, which we address in Sect. 4.2. In Sect. 4.3, we discuss how to adaptively modify the threshold τ as τ-digits is executing, which is useful when a good τ is unknown a priori.
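A minimal sketch of the thresholding test, assuming σ and the specification's outputs on S are encoded as equal-length bit sequences (the actual signature of *unblocked* in Fig. 1 may differ):

```python
# Sketch of the unblocked test: explore a constraint string sigma only if
# its Hamming distance from the specification's outputs on S is at most
# tau * |S|. The input encoding (lists of bits) is our assumption.
def unblocked(sigma, spec_outputs, tau):
    distance = sum(1 for a, b in zip(sigma, spec_outputs) if a != b)
    return distance <= tau * len(sigma)
```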

#### **4.2 Analyzing Failure Probability with Thresholding**

Using τ-digits, the choice of τ will affect both (*i*) how many synthesis queries are performed, and (*ii*) the likelihood that we *miss* optimal solutions; in this section we explore the latter point.<sup>5</sup> Interestingly, we will see that all of the analysis depends only on parameters directly related to the threshold; notably, none of it depends on the complexity of P (i.e., its VC dimension).

<sup>5</sup> The former point is a difficult combinatorial question that to our knowledge has no precedent in the computational learning literature, and so we leave it as future work.

If we really want to learn (something close to) a program P∗, then we should use a value of the threshold τ such that Pr<sub>S∼D<sup>m</sup></sub>[Hamming<sub>S</sub>(P̂, P∗) ≤ τm] is large; doing so requires knowledge of the distribution of Hamming<sub>S</sub>(P̂, P∗). Recall the *binomial distribution*: for parameters (n, p), it describes the number of successes in n-many trials of an experiment that has success probability p.

*Claim.* Fix P and let k = Pr<sub>x∼D</sub>[P̂(x) ≠ P(x)]. If S is sampled from D<sup>m</sup>, then Hamming<sub>S</sub>(P̂, P) is binomially distributed with parameters (m, k).

Next, we will use our knowledge of this distribution to reason about the *failure probability*, i.e., the probability that τ-digits does not preserve the convergence result of digits.

The simplest argument we can make is a union-bound style argument: the thresholded algorithm can "fail" by (*i*) failing to sample an ε-net, or otherwise (*ii*) sampling a set on which the optimal solution has a Hamming distance that is not representative of its actual distance. We provide the quantification of this failure probability in the following theorem:

**Theorem 3.** *Let* P∗ *be a target* ε*-robust program with* k = Pr<sub>x∼D</sub>[P̂(x) ≠ P∗(x)]*, and let* δ *be the probability that* m *samples do not form an* ε*-net for* P∗*. If we run* τ*-*digits *with* τ ∈ (k, 1]*, then the failure probability is at most* δ + Pr[X > τm]*, where* X ∼ Binomial(m, k)*.*

In other words, we can use tail probabilities of the binomial distribution to bound the probability that the threshold causes us to "miss" a desirable program we otherwise would have enumerated. Explicitly, we have the following corollary:

**Corollary 1.** τ*-*digits *increases failure probability (relative to* digits*) by at most* $$\Pr[X > \tau m] = \sum_{i=\lfloor \tau m \rfloor + 1}^{m} \binom{m}{i} k^{i} (1-k)^{m-i}.$$

Informally, when m is *not too small*, k is *not too large*, and τ is *reasonably forgiving*, these tail probabilities can be quite small. We can even analyze the asymptotic behavior by using any existing upper bounds on the binomial distribution's tail probabilities—importantly, the additional error diminishes exponentially as m increases, dependent on the size of τ relative to k.

**Corollary 2.** τ*-*digits *increases failure probability by at most* $e^{-2m(\tau - k)^2}$.<sup>6</sup>

*Example 4.* Suppose m = 100, k = 0.1, and τ = 0.2. Then the extra failure probability term in Theorem 3 is less than 0.001.
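The numbers in Example 4 can be checked directly. The following sketch (ours, using only the Python standard library) computes the exact tail probability from Corollary 1 and the bound from Corollary 2:

```python
from math import comb, exp

# Exact tail Pr[X > tau*m] for X ~ Binomial(m, k), as in Corollary 1.
def binom_tail(m, k, tau):
    cut = int(tau * m)
    return sum(comb(m, i) * k**i * (1 - k)**(m - i)
               for i in range(cut + 1, m + 1))

# Hoeffding-style bound e^{-2m(tau - k)^2} from Corollary 2.
def hoeffding_bound(m, k, tau):
    return exp(-2 * m * (tau - k) ** 2)
```

For m = 100, k = 0.1, τ = 0.2, the exact tail is roughly 8 × 10<sup>−4</sup>, consistent with the 0.001 of Example 4, while the looser Corollary 2 bound evaluates to e<sup>−2</sup> ≈ 0.135.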

As stated at the beginning of this subsection, the balancing act is to choose τ (*i*) small enough so that the algorithm is still fast for large m, yet (*ii*) large enough so that the algorithm is still likely to learn the desired programs. The further challenge is to relax our initial strong assumption that we know the optimal k a priori when determining τ , which we address in the following subsection.

<sup>6</sup> A more precise (though less convenient) bound is $e^{-m\left(\tau \ln \frac{\tau}{k} + (1-\tau) \ln \frac{1-\tau}{1-k}\right)}$.

#### **4.3 Adaptive Threshold**

Of course, we do not have the angelic knowledge that lets us pick an ideal threshold τ ; the only absolutely sound choice we can make is the trivial τ = 1. Fortunately, we can begin with this choice of τ and *adaptively* refine it as the search progresses. Specifically, every time we encounter a correct program P such that k = Er(P), we can refine τ to reflect our newfound knowledge that "the best solution has distance of at most k."

We refer to this refinement as *adaptive* τ-digits. The modification involves the addition of the following rule to Fig. 1:

$$\frac{best \ne \bot}{\tau \gets g(\mathcal{O}\_{\text{err}}(best))} \text{Refine Threshold (for some } g: [0, 1] \to [0, 1])$$

We can use any (non-decreasing) function g to update the threshold τ ← g(k). The simplest choice would be the identity function (which we use in our experiments), although one could use a looser function so as not to over-prune the search. If we choose functions of the form g(k) = k + b, then Corollary 2 allows us to make (slightly weak) claims of the following form:
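Under the g(k) = k + b family, the refinement rule can be sketched as follows (a hedged transcription: the `min` is our addition, making explicit that τ never grows across refinements):

```python
# Refine Threshold rule with g(k) = k + b; b = 0 recovers the identity
# refinement used in the experiments. `best_error` plays the role of
# O_err(best) in Fig. 1.
def refine_threshold(tau, best_error, b=0.0):
    return min(tau, best_error + b)
```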

*Claim.* Suppose the adaptive algorithm completes a search of up to depth m, yielding a best solution with error k (so we have the final threshold value τ = k + b). Suppose also that P∗ is an optimal ε-robust program at distance k − η. The optimization-added failure probability (as in Corollary 1) for a run of (non-adaptive) τ-digits completing depth m and using this τ is at most $e^{-2m(b+\eta)^2}$.

#### **5 Evaluation**

**Implementation.** In this section, we evaluate our new algorithm τ-digits (Fig. 1) and its adaptive variant (Sect. 4.3) against digits (i.e., τ-digits with τ = 1). Both algorithms are implemented in Python and use the SMT solver Z3 [8] to implement a sketch-based synthesizer Osyn. We employ statistical verification for Over and Oerr: we use Hoeffding's inequality for estimating the probabilities in *post* and Er. Probabilities are computed with 95% confidence, leaving our oracles potentially unsound.
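The statistical-verification step can be made concrete with a standard Hoeffding calculation (ours, not the paper's code): picking the sample count n so that 2e<sup>−2nw²</sup> ≤ δ guarantees the empirical estimate is within ±w of the true probability with confidence 1 − δ.

```python
from math import ceil, log

# Samples needed so that, by Hoeffding's inequality, the empirical
# probability is within +/- width of the true one with confidence 1 - delta.
def hoeffding_samples(width, delta=0.05):
    return ceil(log(2 / delta) / (2 * width ** 2))
```

At 95% confidence (δ = 0.05), estimating a probability to within ±0.01 requires 18,445 samples, which is why the oracles trade soundness for tractability.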

**Research Questions.** Our evaluation aims to answer the following questions:

**RQ1** Is adaptive τ-digits more effective/precise than τ-digits?
**RQ2** Is τ-digits more effective/precise than digits?
**RQ3** Can τ-digits solve challenging synthesis problems?

We experiment on three sets of benchmarks: (*i*) synthetic examples for which the optimal solutions can be computed analytically (Sect. 5.1), (*ii*) the set of benchmarks considered in the original digits paper (Sect. 5.2), (*iii*) a variant of the thermostat-controller synthesis problem presented in [7] (Sect. 5.3).

#### **5.1 Synthetic Benchmarks**

We consider a class of synthetic programs for which we can compute the optimal solution exactly; this lets us compare the results of our implementation to an ideal baseline. Here, the program model P is defined as the set of axis-aligned hyperrectangles within [−1, 1]<sup>d</sup> (d ∈ {1, 2, 3}, so the VC dimension is 2d), and the input distribution D is such that inputs are distributed uniformly over [−1, 1]<sup>d</sup>. We fix some probability mass b ∈ {0.05, 0.1, 0.2} and define the benchmarks so that the best error for a correct solution is exactly b (for details, see [9]).

We run our implementation using thresholds τ ∈ {0.07, 0.15, 0.3, 0.5, 1}, omitting those values for which τ < b; additionally, we consider an adaptive run where τ is initialized to 1 and, whenever a new best solution is enumerated with error k, we update τ ← k. Each combination of parameters was run for a period of 2 min. Figure 3 focuses on d = 1, b = 0.1 and shows each of the following as a function of time: (*i*) the depth completed by the search (i.e., the current size of the sample set), and (*ii*) the best solution found by the search. (See the full version of the paper [9] for other configurations of (d, b).)

**Fig. 3.** Synthetic hyperrectangle problem instance with parameters d = 1, b = 0.1.

By studying Fig. 3 we see that the adaptive threshold search performs at least as well as the tight thresholds fixed a priori, because reasonable solutions are found early. In fact, all search configurations find solutions very close to the optimal error (indicated by the horizontal dashed line). Regardless, they reach different depths, and *the main advantage of reaching large depths concerns the strength of the optimality guarantee.* Note, also, that small τ values are necessary to see improvements in the completed depth of the search. Indeed, the discrepancy between the depth-versus-time functions diminishes drastically for the problem instances with larger values of b (see the full version of the paper [9]); the gains of the optimization are contingent on the existence of correct solutions *close* to the functional specification.

**Findings (RQ1):** τ-digits *does* tend to find *reasonable* solutions at early depths and near-optimal solutions at later depths; thus adaptive τ-digits is more effective than τ-digits, and we use it throughout our remaining experiments.

#### **5.2 Original DIGITS Benchmarks**

The original digits paper [1] evaluates on a set of 18 repair problems of varying complexity. The functional specifications are machine-learned decision trees and support vector machines, and each search space P involves the set of programs formed by replacing some number of real-valued constants in the program with holes. The postcondition is a form of *algorithmic fairness*, e.g., the program should output true on inputs of type A as often as it does on inputs of type B [11]. For each such repair problem, we run both digits and adaptive τ-digits (again, with initial τ = 1 and the identity refinement function). Each benchmark is run for 10 min, and the same sample set is used for both algorithms.

**Fig. 4.** Improvement of using adaptive τ-digits on the original digits benchmarks. Left: the dotted line marks the 2.4× average increase in depth.

Figure 4 shows, for each benchmark, (*i*) the largest sample set size completed by adaptive τ-digits versus digits (left; above the diagonal line indicates that adaptive τ-digits reaches further depths), and (*ii*) the error of the best solution found by adaptive τ-digits versus digits (right; below the diagonal line indicates that adaptive τ-digits finds better solutions). We see that adaptive τ-digits reaches further depths on every problem instance, many of which are substantial improvements, and that it finds better solutions on 10 of the 18 problems. For those which did not improve, either the search was already deep enough that digits was able to find near-optimal solutions, or the complexity of the synthesis queries is such that the search is still constrained to small depths.

**Findings (RQ2):** Adaptive τ-digits can find better solutions than those found by digits and can reach greater search depths.

#### **5.3 Thermostat Controller**

We challenge adaptive τ-digits with the task of synthesizing a thermostat controller, borrowing the benchmark from [7]. The input to the controller is the initial temperature of the environment; since the world is uncertain, there is a specified probability distribution over the temperatures. The controller itself is a program sketch consisting primarily of a single main loop: iterations of the loop correspond to timesteps, during which the synthesized parameters dictate an incremental update made by the thermostat based on the current temperature. The loop runs for 40 iterations, then terminates, returning the absolute value of the difference between the final actual temperature and the target temperature.

The postcondition is a Boolean probabilistic correctness property intuitively corresponding to controller safety, e.g., with high probability, the temperature should never exceed certain thresholds. In [7], there is a quantitative objective in the form of minimizing the expected value E[|*actual* − *target*|]; our setting does not admit optimizing with respect to expectations, so we must modify the problem. Instead, we fix some value N (N ∈ {2, 4, 8}) and have the program return 0 when |*actual* − *target*| < N and 1 otherwise. Our quantitative objective is to minimize the error from the constant-zero functional specification P̂(x) := 0 (i.e., the actual temperature always gets close enough to the target). The full specification of the controller is provided in the full version of our paper [9].
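The shape of this modified benchmark can be illustrated with a toy stand-in (all names and the particular update rule are our assumptions; the actual sketch from [7] has synthesis holes and a richer safety postcondition):

```python
# Toy stand-in for the thermostat sketch: each loop iteration makes an
# incremental update toward a synthesized setpoint; `gain` and `setpoint`
# stand in for the holes a synthesizer would fill.
def thermostat_run(init_temp, gain, setpoint, steps=40):
    temp = init_temp
    for _ in range(steps):
        temp += gain * (setpoint - temp)
    return temp

# Boolean-valued objective from Sect. 5.3: return 0 iff the final
# temperature lands within N of the target.
def controller_output(init_temp, gain, setpoint, target, N, steps=40):
    final = thermostat_run(init_temp, gain, setpoint, steps)
    return 0 if abs(final - target) < N else 1
```

With enough iterations the incremental updates converge to the setpoint; with too few (as with small loop unrollings) extremal initial temperatures miss the target band, mirroring the behavior discussed below.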

We consider variants of the program where the thermostat runs for fewer timesteps and try loop unrollings of size {5, 10, 20, 40}. We run each benchmark for 10 min: the final completed search depths and best error of solutions are shown in Fig. 5. For this particular experiment, we use the SMT solver CVC4 [3] because it performs better than Z3 on the occurring SMT instances.

**Fig. 5.** Thermostat controller results.

As we would expect, for larger values of N it is "easier" for the thermostat to reach the target temperature band, and thus the quality of the best solution increases with N. However, with small unrollings (i.e., 5), the synthesized controllers do not have enough iterations (time) to modify the temperature sufficiently for the probability mass at extremal temperatures to reach the target; as we increase the number of unrollings to 10, we see that better solutions can be found, since the set of programs is capable of stronger behavior.

On the other hand, the completed depth of the search plummets as the unrolling increases due to the complexity of the Osyn queries. Consequently, for 20 and 40 unrollings, adaptive <sup>τ</sup> -digits synthesizes worse solutions because it cannot reach the necessary depths to obtain better guarantees.

One final point of note is that for N = 8 and 10 unrollings, there seems to be a sharp spike in the completed depth. However, this is somewhat artificial: because N = 8 creates a very lenient quantitative objective, an early Osyn query happens to yield a program with an error less than 10<sup>−3</sup>. Adaptive τ-digits then updates τ to approximately 10<sup>−3</sup> and skips most synthesis queries.

**Findings (RQ3):** Adaptive τ-digits can synthesize small variants of a complex thermostat controller, but cannot solve variants with many loop iterations.

#### **6 Related Work**

**Synthesis and Probability.** Program synthesis is a mature area with many powerful techniques. The primary focus is on synthesis under Boolean constraints, and probabilistic specifications have received less attention [1,7,17,19]. We discuss the works that are most related to ours.

digits [1] is the most closely related work. First, we show for the first time that digits requires only a number of synthesis queries polynomial in the number of samples. Second, our adaptive τ-digits further reduces the number of synthesis queries required to solve a synthesis problem without sacrificing correctness.

The technique of *smoothed proof search* [7] approximates a combination of functional correctness and maximization of an expected value as a smooth, continuous function. It then uses numerical methods to find a local optimum of this function, which translates to a synthesized program that is likely to be correct and locally maximal. The benchmarks described in Sect. 5.3 are variants of benchmarks from [7]. Smoothed proof search can minimize an expectation, whereas τ-digits minimizes a probability only. However, unlike τ-digits, smoothed proof search lacks formal convergence guarantees and cannot support the rich probabilistic postconditions we support, e.g., as in the fairness benchmarks.

Works on synthesis of probabilistic programs are aimed at a different problem [6,19,23]: that of synthesizing a generative model of data. For example, Nori et al. [19] use sketches of probabilistic programs and complete them with a stochastic search. Recently, Saad et al. [23] synthesize an ensemble of probabilistic programs for learning Gaussian processes and other models.

Kučera et al. [17] present a technique for automatically synthesizing program transformations that introduce uncertainty into a given program with the goal of satisfying given privacy policies, e.g., preventing information leaks. They leverage the specific structure of their problem to reduce it to an SMT constraint-solving problem. The problem tackled in [17] is orthogonal to the one targeted in this paper, and the techniques are therefore very different.

**Stochastic Satisfiability.** Our problem is closely related to e-majsat [18], a special case of *stochastic satisfiability* (ssat) [20] and a means for formalizing probabilistic planning problems. e-majsat is NP<sup>PP</sup>-complete. An e-majsat formula has deterministic and probabilistic variables. The goal is to find an assignment of the deterministic variables such that the probability that the formula is satisfied is above a given threshold. Our setting is similar, but we operate over complex program statements and have an additional optimization objective (i.e., the program should be close to the functional specification). The deterministic variables in our setting are the holes defining the search space; the probabilistic variables are program inputs.

**Acknowledgements.** We thank Shuchi Chawla, Yingyu Liang, Jerry Zhu, the entire fairness reading group at UW-Madison, and Nika Haghtalab for all of the detailed discussions. This material is based upon work supported by the National Science Foundation under grant numbers 1566015, 1704117, and 1750965.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Membership-Based Synthesis of Linear Hybrid Automata**

Miriam García Soto(B), Thomas A. Henzinger, Christian Schilling, and Luka Zeleznik

> IST Austria, Klosterneuburg, Austria {miriam.garciasoto,tah,christian.schilling, luka.zeleznik}@ist.ac.at

**Abstract.** We present two algorithmic approaches for synthesizing linear hybrid automata from experimental data. Unlike previous approaches, our algorithms work without a template and generate an automaton with nondeterministic guards and invariants, and with an arbitrary number and topology of modes. They thus construct a succinct model from the data and provide formal guarantees. In particular, (1) the generated automaton can reproduce the data up to a specified tolerance and (2) the automaton is tight, given the first guarantee. Our first approach encodes the synthesis problem as a logical formula in the theory of linear arithmetic, which can then be solved by an smt solver. This approach minimizes the number of modes in the resulting model but is only feasible for limited data sets. To address scalability, we propose a second approach that does not insist on finding a minimal model. The algorithm constructs an initial automaton and then iteratively extends the automaton based on processing new data. The algorithm is therefore well-suited for online and synthesis-in-the-loop applications. The core of the algorithm is a membership query that checks whether, within the specified tolerance, a given data set can result from the execution of a given automaton. We solve this membership problem for linear hybrid automata by repeated reachability computations. We demonstrate the effectiveness of the algorithm on synthetic data sets and on cardiac-cell measurements.

**Keywords:** Synthesis · Linear hybrid automaton · Membership

#### **1 Introduction**

The natural sciences strive to understand the mechanisms of real systems and to make this understanding accessible. Achieving these two goals requires observation, analysis, and modeling of the system. Typically, physical components of a

This research was supported in part by the Austrian Science Fund (FWF) under grants S11402-N23 (RiSE/SHiNE) and Z211-N23 (Wittgenstein Award) and the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 754411.

system evolve continuously in real time, while the system may switch among a finite set of discrete states. This applies to cyber-physical systems but also to purely analog systems; e.g., an animal's hunger affects its movement. A proper formalism for modeling such types of systems with mixed discrete-continuous behavior is a hybrid automaton [11]. Unlike black-box models such as neural networks, hybrid automata are easy to interpret by humans. However, designing such models is a time-intensive and error-prone process, usually conducted by an expert who analyzes the experimental data and makes decisions.

In this paper, we propose two automatic approaches for synthesizing a linear hybrid automaton [1] from experimental data. The approaches provide two main properties. The first property is *soundness*, which ensures that the generated model has enough executions: these executions approximate the given data up to a predefined accuracy. The second property is *precision*, which ensures that the generated model does not have too many executions. The behavior of a hybrid automaton is constrained by so-called invariants and guards. *Precision* expresses that the boundaries of these invariants and guards are witnessed by the data, which indicates that the constraints cannot be made tighter. Moreover, the proposed synthesis algorithm is *complete* for a general class of linear hybrid automata, i.e., the algorithm can synthesize any given model from this class.

The first approach reduces the synthesis problem to a satisfiability question for a linear-arithmetic formula. The formula allows us to encode a minimality constraint (namely in the number of so-called modes) on the resulting model. This approach is, however, not scalable, which motivates our second approach. Our second approach follows an iterative model-adaptation scheme. Apart from scalability advantages, this *online* algorithm is thus also well-suited for synthesis-in-the-loop applications.

After constructing an initial model, the second approach iteratively improves and expands the model by considering new experiments. After each iteration, the model will capture all behaviors exhibited in the previous experiments. Given an automaton and new experimental data, the algorithm proceeds as follows. First we ask whether the current automaton already captures the data. We pose this question as a membership query for a piecewise-linear function in the set of executions of the automaton. For the membership query, we present an algorithm based on reachability inside a tube around the function. If the data is not captured, we need to modify the automaton accordingly by adding behavior. We first try to relax the above-mentioned invariants and guards, which we reduce to another membership query. If that query is negative as well, we choose a path in the automaton that closely resembles the given data and then modify the automaton along that path by also adding new discrete structure (called modes and transitions). This modification step is again guided by membership queries to identify the aspects of the model that require improvement and expansion.

As the main contributions, (1) we present an online algorithm for automatic synthesis of linear hybrid automata from data that is *sound*, i.e., guarantees that the generated model approximates the data up to a user-defined threshold, *precise*, i.e., the generated model is tight, and *complete* for a general class of models, and (2) we solve the membership problem of a piecewise-linear function in a linear hybrid automaton. This is a critical step in our synthesis algorithm.

*Related Work.* The synthesis of hybrid systems was initially studied in control theory under the term *identification*, mainly focused on (discrete-time) switched autoregressive exogenous (SARX) and piecewise-affine autoregressive exogenous (PWARX) models [7,18]. SARX models constitute a subclass of linear hybrid automata with deterministic switching behavior. PWARX models are specific SARX models where the mode invariants form a state-space partition. Fixing the number of modes, the identification problem from input-output data can be solved algebraically by inferring template parameters. However, in contrast to linear hybrid automata, the lack of nondeterminism and the underlying assumption that there is no hidden state (mode) limits the applicability of these models. An algorithm by Bemporad et al. constructs a PWARX model that satisfies a *global* error bound [5]. Ozay presents an algorithm for SARX models where the switching is purely time-triggered [17]. There also exist a few *online* algorithms for the recursive synthesis of PWARX models based on pattern recognition [19] or lifting to a high-dimensional identification problem for ARX models [10,22].

Synthesis is also known as *process mining*, and as *learning models from traces*; the latter refers to approaches based on learning finite-state machines [3] or other machine-learning techniques. More recently, synthesis of hybrid automaton models has gained attention. All existing approaches that we are aware of have structural restrictions of some sort, which we describe below. We synthesize, for the first time, a general class of linear hybrid automata which (1) allows nondeterminism to capture many behaviors by a *concise* representation and (2) provides formal soundness and precision guarantees. The algorithm is also the first *online* synthesis approach for linear hybrid automata.

The general synthesis problem for hybrid automata is hard: for deterministic timed automata (a subclass of linear hybrid automata with globally identical continuous dynamics), one may already require data of exponential length [21]. The approach by Niggemann et al. constructs an automaton with acyclic discrete structure [16], while the approach by Grosu et al., intended to model purely periodic behavior, constructs a cyclic-linear hybrid automaton whose discrete structure consists of a loop [8]. Ly and Lipson use symbolic regression to infer a non-linear hybrid automaton [14]. However, their model neither contains state variables (i.e., the model is purely input-driven, comparable to the SARX model) nor invariants, and the number of modes needs to be fixed in advance. Medhat et al. describe an abstract framework, based on heuristics, to learn linear hybrid automata from input/output traces [15]. They first employ Angluin's algorithm for learning a finite-state machine [3], which serves as the discrete structure of the hybrid automaton, before they decorate the automaton with continuous dynamics. This strict separation inherently makes their approach offline. The work by Summerville et al. based on least-squares regression requires an exhaustive construction of all possible models for later optimizing a cost function over all of them [20]. Lamrani et al. learn a completely deterministic model with urgent transitions using ideas from information theory [12].

#### **2 Preliminaries**

*Sets.* Let R, R<sub>≥0</sub>, and N denote the set of real numbers, non-negative real numbers, and natural numbers, respectively. We write **x** for points (x<sub>1</sub>,...,x<sub>n</sub>) in R<sup>n</sup>. Let cpoly(n) be the set of compact and convex polyhedral sets over R<sup>n</sup>. A set X ∈ cpoly(n) is characterized by its set of vertices vert(X). For a set of points Y, chull(Y) ∈ cpoly(n) denotes the convex hull. Given a set X ∈ cpoly(n) and ε ∈ R<sub>≥0</sub>, we define the ε*-bloating* of X as X<sup>ε</sup> := {**x** ∈ R<sup>n</sup> | ∃**x**<sub>0</sub> ∈ X : ‖**x** − **x**<sub>0</sub>‖ ≤ ε} ∈ cpoly(n), where ‖·‖ is the infinity norm. Given an interval I = [l, u] ∈ cpoly(1), lb(I) = l and ub(I) = u denote its lower and upper bound.

*Functions and Sequences.* Given a function f, let dom(f) resp. img(f) denote its domain resp. image. Let f↾<sub>A</sub> denote the restriction of f to a domain A ⊆ dom(f). We define a *distance* between functions f and g with the same domain and codomain by d(f, g) := max<sub>t∈dom(f)</sub> ‖f(t) − g(t)‖. A *sequence* of length m is a function s : D → A over an ordered finite domain D = {i<sub>1</sub>,...,i<sub>m</sub>} ⊆ N and a set A, and we write len(s) to denote the length of s. A sequence s is also represented by enumerating its elements, as in s(i<sub>1</sub>),...,s(i<sub>m</sub>).

*Affine and Piecewise-Linear Functions.* An *affine piece* is a function p : I → R<sup>n</sup> over an interval I = [t<sub>0</sub>, t<sub>1</sub>] ⊆ R defined as p(t) = **a**t + **b**, where **a**, **b** ∈ R<sup>n</sup>. Given an affine piece p, init(p) denotes the start point p(t<sub>0</sub>), end(p) denotes the end point p(t<sub>1</sub>), and slope(p) denotes the slope **a**. We call two affine pieces p and p′ *adjacent* if end(p) = init(p′) and ub(dom(p)) = lb(dom(p′)). For m ∈ N, an m*-piecewise-linear (*m-pwl*) function* f : I → R<sup>n</sup> over an interval I = [0, T] ⊆ R consists of m affine pieces p<sub>1</sub>,...,p<sub>m</sub> such that I = ∪<sub>1≤j≤m</sub> dom(p<sub>j</sub>), f(t) = p<sub>j</sub>(t) for t ∈ dom(p<sub>j</sub>), and for every 1 < j ≤ m we have end(p<sub>j−1</sub>) = init(p<sub>j</sub>). We show a 3-pwl function in Fig. 1 on the left. Let pieces(f) denote the set of affine pieces of f. We refer to f and the sequence p<sub>1</sub>,...,p<sub>m</sub> interchangeably and write "pwl function" if m is clear from the context. A *kink* of a pwl function is the point between two adjacent pieces. Given a pwl function f : I → R<sup>n</sup> and a value ε ∈ R<sub>≥0</sub>, the ε*-tube* of f is the function tube<sub>f,ε</sub> : I → cpoly(n) such that tube<sub>f,ε</sub>(t) = f(t)<sup>ε</sup>.
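These definitions translate directly into code. Below is a sketch (the representation is our choice, not the paper's) of an n-dimensional pwl function as a list of pieces (t0, t1, **a**, **b**), together with the infinity-norm membership test for its ε-tube:

```python
# A pwl function as adjacent affine pieces (t0, t1, a, b) with
# p(t) = a*t + b componentwise; a and b are lists of length n.
def pwl_eval(pieces, t):
    for (t0, t1, a, b) in pieces:
        if t0 <= t <= t1:
            return [ai * t + bi for ai, bi in zip(a, b)]
    raise ValueError("t outside the domain of f")

# Membership of a point x in tube_{f,eps}(t): the eps-bloating of f(t)
# under the infinity norm.
def in_tube(pieces, eps, t, x):
    f_t = pwl_eval(pieces, t)
    return max(abs(xi - fi) for xi, fi in zip(x, f_t)) <= eps
```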

*Graphs.* A *graph* is a pair (V, E) of a finite set V and a relation E ⊆ V × V. A *path* π in (V, E) is a sequence v<sub>1</sub>,...,v<sub>m</sub> with (v<sub>j−1</sub>, v<sub>j</sub>) ∈ E for 1 < j ≤ m.

*Hybrid Automata.* We consider a particular class of hybrid automata [1,11].

**Definition 1.** *An* n*-dimensional* linear hybrid automaton (lha) *is a tuple* H = (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*)*, where (1) Q is a finite set of modes, (2) E ⊆ Q × Q is a transition relation, (3) X = ℝ^n is the continuous state-space, (4) Flow :* Q → ℝ^n *is the flow function, (5) Inv :* Q → cpoly(n) *is the invariant function, and (6) Grd :* E → cpoly(n) *is the guard function.*

We sometimes annotate the elements of lha H by a subscript, as in *Q*_H for the set of modes. We refer to (*Q*_H, *E*_H) as the *graph of* lha H.

An lha evolves continuously according to the flow function in each mode. The behavior starts in some mode q ∈ *Q* and some continuous state **x** ∈ *Inv*(q). For every mode q ∈ *Q*, the continuous evolution follows the differential equation **x**˙ = *Flow*(q) while satisfying the invariant *Inv*(q). The behavior can switch from one mode q<sup>1</sup> to another mode q<sup>2</sup> if there is a transition (q1, q2) ∈ *E* and the guard *Grd*((q1, q2)) is satisfied. During a switch, the continuous state does not change. This type of system is sometimes called a switched linear hybrid system [13].
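This switched evolution is easy to encode. A minimal Python sketch for the 1-dimensional case (our own encoding; invariants and guards are intervals, a special case of cpoly(1)):

```python
from dataclasses import dataclass

@dataclass
class Lha:
    """1-dimensional linear hybrid automaton with interval-valued
    invariants and guards."""
    flow: dict   # mode -> slope a, i.e., dynamics x' = a
    inv: dict    # mode -> invariant interval (lo, hi)
    grd: dict    # transition (q1, q2) -> guard interval (lo, hi)

def run(lha, q0, x0, schedule):
    """Evolve from (q0, x0) through a list of (dwell_time, next_mode)
    pairs; return the visited (mode, state) pairs at switches, or None
    if an invariant or guard is violated.  With constant flow on
    interval invariants it suffices to check the segment endpoints."""
    q, x, trace = q0, x0, [(q0, x0)]
    for dt, q_next in schedule:
        x = x + lha.flow[q] * dt          # continuous evolution x' = a
        lo, hi = lha.inv[q]
        if not (lo <= x <= hi):
            return None                   # left the invariant
        if q_next is not None:
            glo, ghi = lha.grd[(q, q_next)]
            if not (glo <= x <= ghi):
                return None               # guard not satisfied
            q = q_next                    # switch; x does not change
            trace.append((q, x))
    return trace
```
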

**Definition 2.** *Given an* n*-dimensional* lha H = (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*)*, an* execution σ *is a triple* σ = (I, γ, δ)*, where* I *is a sequence of consecutive intervals* [t_0, t_1], [t_1, t_2], ..., [t_{m−1}, t_m] *with* [[I]] = ∪_{0≤j<m} [t_j, t_{j+1}]*, and* γ : [[I]] → ℝ^n *and* δ : {1, ..., m} → *Q are functions with the following restrictions:*


We denote the set of all executions of H by exec(H). Given an lha H, we say that an execution σ *follows a path* π in H, that is, in the graph (*Q*_H, *E*_H), denoted as σ ⇝_H π, if len(I) = len(π) and δ(j) = π(j) for every 0 ≤ j < len(I).

*From Time-series Data to* pwl *Functions.* Experimental data typically comes as *time series*, i.e., data is only available at sampled points in time. A time series is a sampling <sup>s</sup> : <sup>D</sup> <sup>→</sup> <sup>R</sup><sup>n</sup> over a finite time domain <sup>D</sup> <sup>⊆</sup> [0,T]. Since the lha model features piecewise-linear executions, we focus on piecewise-linear approximation of the data. pwl functions can approximate any continuous behavior with arbitrary precision. There are different yet valid choices for approximating data. For a single time series, linear interpolation gives a perfect fit, but contains many kinks; other algorithms minimize the number of kinks for a given error bound [6,9]. One can preprocess multiple time series into a single pwl function using, e.g., linear regression. In this paper, we leave the choice of abstraction open and assume that the input is given as pwl functions.
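As an illustration of such a preprocessing step, here is a simple greedy segmentation in Python (our own sketch, not one of the algorithms of [6,9]): a piece is extended as long as the chord from its start point stays within a tolerance δ of all intermediate samples.

```python
def series_to_pwl(ts, xs, delta):
    """Greedily split a sampled 1-d signal (ts[i], xs[i]) into segments
    whose linear interpolation deviates from the samples by at most
    delta.  Returns the indices of the kept breakpoints (the kinks)."""
    breaks = [0]
    start = 0
    for end in range(2, len(ts)):
        t0, x0 = ts[start], xs[start]
        t1, x1 = ts[end], xs[end]
        # Check all samples strictly between start and end against the
        # chord from (t0, x0) to (t1, x1).
        ok = all(
            abs(x0 + (x1 - x0) * (ts[k] - t0) / (t1 - t0) - xs[k]) <= delta
            for k in range(start + 1, end)
        )
        if not ok:
            breaks.append(end - 1)   # last index that still fit
            start = end - 1
    breaks.append(len(ts) - 1)
    return breaks
```

For instance, a sampled triangle wave is reduced to its two linear legs.
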

#### **3 Synthesis of Linear Hybrid Automata**

In this section, we specify the synthesis problem, consider two different specifications, synchronous and asynchronous, and present the automated approach for solving the synchronous problem. The overall goal is to synthesize a linear hybrid automaton from a set of pwl functions such that the automaton *captures* the behavior described by each of the pwl functions up to a bound ε.

**Definition 3 (Soundness).** *Given a* pwl *function* f *and a value* ε ∈ ℝ≥0*, we say that an* lha H ε*-*captures f *if there exists an execution* σ = (I, γ, δ) *in* exec(H) *with* d(f, γ) ≤ ε*.*

The value ε quantifies the acceptable deviation of an execution's continuous function γ from the pwl function f. For ε = 0, γ must precisely follow f. A straightforward formulation of the problem we want to solve is the following.

*Problem 1 (Synthesis).* Given a finite set of pwl functions F and ε ∈ ℝ≥0, construct an lha H that ε-captures every function f ∈ F.

Observe that this problem is not well-posed, as it can be satisfied by an automaton that exhibits an excessive amount of behavior. Hence our second goal for the synthesis algorithm is to ensure constraints on the automaton's size. We start with the synthesis of an lha with a minimal number of modes.

#### **3.1 Synchronous Switching Specification**

For now, we require that the executions in the lha switch *synchronously* with the given pwl functions. Under this assumption, we tackle a refinement of Problem 1:

*Problem 2 (Synchronous synthesis).* Given a finite set of pwl functions F and a value ε ∈ ℝ≥0, construct an lha H that ε-captures every function f ∈ F synchronously and, furthermore, has a minimal number of modes.

In the following, we present an algorithm to solve Problem 2. The idea is, given a pwl function f, to synthesize an execution σ that is ε-close to f. Recall that the continuous function γ of an execution is essentially just another pwl function. Any lha that contains the execution σ has to comprise a mode for each different slope in γ. Thus a minimal number of modes can be achieved by minimizing the number of different slopes in γ. By fixing a number of different slopes, we encode the existence of γ as a logical formula φf,ε, which will be satisfiable if and only if there exists a suitable function γ.

Let m be the number of affine pieces p_1, ..., p_m in f, with dom(p_j) = [t_{j−1}, t_j] for 1 ≤ j ≤ m. We refer to the time instants t_j as the switching times of f, and to **x**_j = f(t_j) as the switching points of f. Fixing a number ℓ ∈ ℕ, we want to construct a pwl function γ consisting of m affine pieces p'_1, ..., p'_m with ℓ different slopes, with the same switching times as in f, with switching points **y**_0, ..., **y**_m ε-close to those in f (which is necessary and sufficient for d(f, γ) ≤ ε), and with unknown slopes **b**_1 = slope(p'_1), ..., **b**_m = slope(p'_m). We define the logical formula

$$\phi\_{f,\varepsilon}(\ell) \coloneqq \bigwedge\_{j=1}^{m} \mathbf{y}\_j = \mathbf{y}\_{j-1} + \mathbf{b}\_j (t\_j - t\_{j-1}) \wedge \bigwedge\_{j=0}^{m} \mathbf{y}\_j \in [\mathbf{x}\_j]\_{\varepsilon} \wedge \bigwedge\_{j=1}^{m} \bigvee\_{k=1}^{\ell} \mathbf{b}\_j = \mathbf{c}\_k,$$

which is satisfiable if and only if there exists a suitable pwl function γ. For lifting to a set of functions F, we define the formula φ_{F,ε}(ℓ) := ∧_{f∈F} φ_{f,ε}(ℓ). These formulae fall into the theory of linear arithmetic and can be effectively solved by an smt solver. Now, we can state the following results.

**Lemma 1.** *Let* F *be a finite set of* pwl *functions and* ε ∈ ℝ≥0*. If* φ_{F,ε}(ℓ) *is satisfiable for some integer value* ℓ*, then there exists a set of* pwl *functions* F' *such that* |F'| = |F|*, each function in* F' *is* ε*-close to some function in* F*, and the number of distinct slopes in* F' *does not exceed* ℓ*.*

The set F' can be extracted from a satisfying assignment. We next define a hybrid automaton with a minimal number of locations that 0-captures a given pwl function.

**Definition 4 (Canonical automaton).** *Let* f *be an* m*-*pwl *function. The* canonical automaton *of* f *is* H_f := (*Q*, *E*, ℝ^n, *Flow*, *Inv*, *Grd*) *with*


**Lemma 2.** *Given a* pwl *function* f*, the canonical automaton* H_f 0*-captures* f*, and every* lha *that* 0*-captures* f *has at least as many modes as* H_f*.*

**Definition 5 (Merging).** *Given two hybrid automata* H_i = (*Q*_i, *E*_i, *X*, *Flow*_i, *Inv*_i, *Grd*_i)*,* i = 1, 2*, with Q*_1 ∩ *Q*_2 = ∅*, let Q*_**a** = *Q*_{H_1,**a**} ∪ *Q*_{H_2,**a**} *be the locations with flow equal to* **a***. We define the* merging *of* H_1 *and* H_2 *as* H_1 ⊎ H_2 := (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*) *with Q* = {q_**a** | **a** ∈ ℝ^n, *Q*_**a** ≠ ∅}*, E* = {(q_**a**, q_**a'**) | ∃(q, q') ∈ *E*_1 ∪ *E*_2, q ∈ *Q*_**a**, q' ∈ *Q*_**a'**}*, Flow*(q_**a**) = **a***, Inv*(q_**a**) = chull({*Inv*_i(q) | q ∈ *Q*_**a**, i = 1, 2})*, and Grd*((q_**a**, q_**a'**)) = chull({*Grd*_i((q, q')) | (q, q') ∈ *E*_i, q ∈ *Q*_**a**, q' ∈ *Q*_**a'**, i = 1, 2})*.*
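For the interval-based 1-dimensional encoding, merging amounts to grouping modes by slope and taking hulls of intervals. A sketch under our own encoding (automata as dicts with keys flow, inv, grd; the convex hull of intervals is just min/max):

```python
def hull(intervals):
    """Convex hull of a nonempty collection of intervals."""
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))

def merge(a1, a2):
    """Merge two 1-d automata with disjoint mode names.  Modes of the
    result are identified with their (distinct) slopes a."""
    flows = {**a1["flow"], **a2["flow"]}
    modes = set(flows.values())                  # one merged mode per slope
    inv = {a: hull([A["inv"][q] for A in (a1, a2)
                    for q in A["flow"] if A["flow"][q] == a])
           for a in modes}
    grd = {}
    for A in (a1, a2):
        for (q, q2), g in A["grd"].items():
            e = (A["flow"][q], A["flow"][q2])    # lift the edge to slopes
            grd[e] = hull([grd[e], g]) if e in grd else g
    return {"flow": {a: a for a in modes}, "inv": inv, "grd": grd}
```
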

**Theorem 1.** *Given a finite set of* pwl *functions* F *and a value* ε ∈ ℝ≥0*, let* ℓ *be the smallest integer such that* φ_{F,ε}(ℓ) *is satisfiable, and let* F' *be a set of* pwl *functions corresponding to a satisfying assignment. Then the merging of canonical automata* ⊎_{f∈F'} H_f *solves Problem 2.*

The above synthesis algorithm works well with short and low-dimensional pwl functions but does not scale to realistic problem sizes due to the heavy use of disjunctions. We next address scalability with a new online algorithm.

#### **3.2 Asynchronous Switching Specification**

We now change the requirement from the previous subsection (minimality of the model's discrete structure) to tightness of the model's state-space constraints. Intuitively, for every vertex **v** of an invariant or guard in H there should be some witness data f ∈ F that is close to **v** (at some point in time).

**Definition 6 (Precision).** *Given an* lha <sup>H</sup> = (*Q*,*E*, *<sup>X</sup>*, *Flow*,*Inv*, *Grd*)*, let vert*(H) *denote the union of the vertices of the invariants and guards:*

$$\operatorname{vert}(\mathcal{H}) = \bigcup\_{q \in Q} \operatorname{vert}(\mathit{Inv}(q)) \cup \bigcup\_{e \in E} \operatorname{vert}(\mathit{Grd}(e))$$

*Given a set of* pwl *functions* F *and a value* ε ∈ ℝ≥0*, we say that* H *is* ε-precise *(with respect to* F*) if the following holds:*

$$\forall v \in \operatorname{vert}(\mathcal{H})\ \exists f \in \mathcal{F}\ \exists t \in \operatorname{dom}(f) : \left\| v - f(t) \right\| \leqslant \varepsilon.$$

The restriction to the vertices is reasonable because all sets are compact convex polyhedra. Note that ε-capturing compares functions to the automaton's executions, while ε-precision compares functions to the automaton's state-space.

We also relax the limitation to synchronously switching executions. Instead, we allow *asynchronous* switching, characterized as follows: for every function f ε-captured by H, there exists an execution σ ∈ exec(H) with the same number of switches as there are kinks in f, i.e., len(I) = |pieces(f)|, and where the j-th switch in the execution should take place during the time period between the kinks j − 1 and j + 1. We close this section with the new problem statement (a refinement of Problem 1), and present a solution in the next section.

*Problem 3 (Asynchronous synthesis).* Given a finite set of pwl functions F and a value ε ∈ ℝ≥0, construct an ε-precise lha H that ε-captures every function f ∈ F asynchronously.

#### **4 Membership-based Synthesis Approach**

In this section, we present an algorithm for solving Problem 3. The core of the algorithm is a reachability computation that provides the polyhedral regions where executions of an lha that are ε-close to a given pwl function f are allowed to switch. More precisely, given a path π and the ε-tube of f, the algorithm iteratively constructs the set inside the ε-tube where an execution following π can switch without escaping from the tube. These reachable sets are, in general, computed with respect to a starting compact convex polyhedron P, a pair of adjacent affine pieces p and p', and a pair of modes q and q' along π.

**Definition 7.** *Given an* lha H = (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*) *and a value* ε ∈ ℝ≥0*, the* reachable switching set *switch*_H(P, p, p', q, q') *from a set* P *with respect to two adjacent affine pieces* p, p' *and a path* π := q, q' *in* H *is defined as*

$$\begin{split} \{ \mathbf{x} \in \mathit{Grd}((q, q')) \mid \exists \sigma = (\mathcal{I}, \gamma, \delta) \in \mathtt{exec}(\mathcal{H}) : \sigma \stackrel{\mathcal{H}}{\leadsto} \pi,\ \mathtt{dom}(\gamma) = \mathtt{dom}(p) \cup \mathtt{dom}(p'), \\ \gamma(0) \in P,\ \gamma(t) \in \mathtt{tube}\_{p, \varepsilon}(t) \cup \mathtt{tube}\_{p', \varepsilon}(t), \text{ and } \mathbf{x} = \gamma(\mathtt{ub}(\mathcal{I}(1))) \}. \end{split}$$
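In the special case of synchronous switching (the execution switches exactly at the kink between p and p'), the switching set reduces to one flow-translation and two intersections. A 1-dimensional interval sketch of this special case (the general asynchronous case needs the full polyhedral computation):

```python
def intersect(a, b):
    """Intersection of two intervals, or None if it is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def switch_sync(P, t0, t1, flow, tube_at_t1, guard):
    """Synchronous special case of switch_H: let the start set P flow
    with slope `flow` from time t0 to the kink time t1, then intersect
    with the eps-tube at t1 and with the guard of the transition."""
    shift = flow * (t1 - t0)
    moved = (P[0] + shift, P[1] + shift)    # image of P under the flow
    out = intersect(moved, tube_at_t1)
    return intersect(out, guard) if out is not None else None
```
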

**Inductive Reachable Switching Computation.** Given an lha H, an m-pwl function f = p_1, ..., p_m, a value ε ∈ ℝ≥0 and a path π = q_1, ..., q_m in the graph (*Q*_H, *E*_H), we compute the reachable switching set P^π_j for every 0 ≤ j ≤ m:

$$\begin{array}{l} -\ P\_{0}^{\pi} := \mathit{Inv}\_{\mathcal{H}}(q\_{1}) \cap \mathtt{tube}\_{f,\varepsilon}(0),\\ -\ P\_{j}^{\pi} := \mathit{switch}\_{\mathcal{H}}(P\_{j-1}^{\pi}, p\_{j}, p\_{j+1}, q\_{j}, q\_{j+1}) \text{ for } 0 < j < m, \text{ and} \\ -\ P\_{m}^{\pi} := \{\mathbf{x} \in \mathit{Inv}(q\_{m}) \mid \exists \sigma = (\mathcal{I}, \gamma, \delta) \in \mathtt{exec}(\mathcal{H}) : \sigma \stackrel{\mathcal{H}}{\leadsto} q\_{m},\ \gamma(0) \in P\_{m-1}^{\pi}, \\ \quad \mathsf{dom}(\gamma) = \mathsf{dom}(p\_{m}),\ \gamma(t) \in \mathtt{tube}\_{p\_{m},\varepsilon}(t) \text{ and } \mathbf{x} = \gamma(\mathsf{ub}(\mathcal{I}(1))) \}. \end{array}$$

We denote the set of all reachable switching sets P^π_j by 𝒫^π. We are now ready to present the complete synthesis algorithm.


#### **4.1 Membership-based Synthesis Algorithm**

The synthesis algorithm, outlined in Algorithm 1, computes an lha H solving Problem 3 for a given finite set of pwl functions F and a value ε ∈ ℝ≥0. The algorithm initially infers an lha H that ε-captures the first function f_0 of F in an ε-precise manner (line 1). The remaining pwl functions are handled in an iterative loop. For each pwl function f, the algorithm performs a membership query, where it checks whether f is ε-captured by the lha H (line 3). If the query results in a positive answer (*ans* = *True*), nothing needs to be done. Otherwise, the query returns a path π and the lha H needs to be modified. The modification of the automaton H is performed in two attempts. The first attempt, in line 5, temporarily enlarges the invariants and guards of H. If such a modification is sufficient to let the membership query succeed, the modifications are made permanent in line 8. Otherwise, in the second attempt, the algorithm adds new modes and/or transitions to H along the path π. Below we describe every procedure of Algorithm 1 in detail.
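The loop just described can be sketched as follows. The procedure names are those from the text, but here they are passed in as parameters (the real procedures operate on polyhedra), so the skeleton itself can be exercised with simple stand-ins:

```python
def synthesis(F, eps, init_lha, membership, relax_all, relax_path, adapt):
    """Skeleton of Algorithm 1: infer an initial model from the first
    pwl function, then repair the model for each further function that
    is not yet eps-captured, first by relaxation, then by adaptation."""
    H = init_lha(F[0], eps)                    # line 1: initial lha
    for f in F[1:]:
        ans, path = membership(f, H, eps)      # line 3: membership query
        if ans:
            continue                           # f is already captured
        H_tmp = relax_all(H, f, eps)           # line 5: tentative relaxation
        ans, path = membership(f, H_tmp, eps)
        if ans:
            H = relax_path(H, f, eps, path)    # line 8: make it permanent
        else:
            H = adapt(H, f, eps, path)         # line 10: add modes/edges
    return H
```

With toy stand-ins (a "model" is just the set of slopes it can follow), the control flow is already visible.
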

**Initialization.** The procedure InitLha(f, ε) constructs an initial lha H that ε-captures an m-pwl function f. Observe that, by Lemma 2, the canonical automaton H_f 0-captures (and hence ε-captures) the function f. In order to allow similar dynamical behaviors in the resulting lha H, the procedure InitLha(f, ε) ε-bloats both the invariant and the guard polyhedra. The procedure InitLha(f, ε) outputs the ε-bloated canonical automaton H^ε_f and is illustrated in Fig. 1.

**Definition 8.** *Given an* lha H = (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*)*, we define the* ε-bloated lha *of* H *as* H^ε = (*Q*, *E*, *X*, *Flow*, *Inv*^ε, *Grd*^ε)*, where Inv*^ε(q) = [*Inv*(q)]_ε *for every* q ∈ *Q and Grd*^ε(e) = [*Grd*(e)]_ε *for every* e ∈ *E.*
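For axis-aligned boxes (a special case of cpoly(n)), ε-bloating simply widens every bound by ε. A small Python sketch of Definition 8 under this simplification:

```python
def bloat(box, eps):
    """[P]_eps for an axis-aligned box given as a list of (lo, hi)
    bounds per dimension: widen every bound by eps."""
    return [(lo - eps, hi + eps) for lo, hi in box]

def bloat_lha(inv, grd, eps):
    """eps-bloated lha: bloat every invariant and every guard; modes,
    transitions and flows are unchanged."""
    return ({q: bloat(b, eps) for q, b in inv.items()},
            {e: bloat(b, eps) for e, b in grd.items()})
```
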

**Lemma 3.** *Given a* pwl *function* f *and* ε ∈ ℝ≥0*,* H^ε_f ε*-captures* f*.*

**Fig. 1.** Example describing the procedure InitLha(f, ε) for a 3-pwl function f = f_0 (depicted on the left). The function f_0 consists of three pieces p_0, p_1, p_2 with slopes 1, 0, 1, respectively. The lha on the right is constructed as follows. Mode q_0 corresponds to pieces p_0 and p_2; the invariant is the ε-bloating of the interval [1, 3] (which is the convex hull of every start and end point in both pieces). Likewise, mode q_1 corresponds to piece p_1. Transitions and their guards correspond to the kinks of f_0 at t = 1 and t = 2.

**Membership.** The procedure Membership(f, H, ε) checks whether there exists an *asynchronous* execution σ = (I, γ, δ) in H such that d(f, γ) ≤ ε holds. Let us introduce the notions required to formalize the membership problem.

**Definition 9.** *An execution* σ = (I, γ, δ) *of an* lha H *is* consistent *with an* m*-*pwl *function* f*, described by the affine pieces* p_1, ..., p_m*, if* len(I) = m*,* [[I]] = dom(f)*, and* ub(I(j)) ∈ dom(p_j) ∪ dom(p_{j+1}) *for every* 1 ≤ j < m*.*

*Problem 4 (Membership).* Given an m-pwl function f, an lha H, and a value ε ∈ ℝ≥0, decide whether there exists an execution σ = (I, γ, δ) in exec(H) that is consistent with f and such that d(f, γ) ≤ ε holds.

The procedure Membership(f, H, ε) solves Problem 4 by computing the reachable switching sets for every path π of length m in H until it finds a path π where every reachable switching set P^π_j for 0 ≤ j ≤ m is nonempty. Upon finding a path π satisfying these constraints, Membership(f, H, ε) returns *True* as answer, together with the path π. If no such path π exists, it returns *False* as answer. We show an example in Fig. 2(a). We remark that, for a fixed path, Problem 4 is a timestamp-generation problem [2] with the restriction to time intervals for switching and the ε-tube as solution corridor.

**Lemma 4.** *Let* H *be an* lha *and* f *be an* m*-*pwl *function. Then there exists a path* π *of length* m *in* H *such that the final reachable switching set* P^π_m *is not empty if and only if there exists an execution* σ *in* exec(H) *solving Problem 4.*

**Relaxation.** If Membership(f, H, ε) returns *False*, RelaxAll(H, f, ε) constructs an automaton H' that is equivalent to H except that its invariants and guards are enlarged to allow additional executions inside tube_{f,ε}. Then, the algorithm computes Membership(f, H', ε). If the answer is *False* again, the algorithm proceeds to the adaptation procedure in line 10. Otherwise (if the answer is *True*), we obtain a path π in H'. Then the algorithm executes the procedure RelaxPath(H, f, ε, π), which extends the constraints of invariants and guards

**Fig. 2. (a)** Example describing the procedure Membership(f, H, ε). On the left we depict a 3-pwl function f_1 and its ε-tube. On the right we show a possible execution in the lha from Fig. 1. **(b)** Given an affine piece p, we say that another piece has a *similar* slope if it does not leave the tube. In the figure, we show the minimal and the maximal allowed slopes by dashed segments.

in H for the modes in π by taking the convex hull with the corresponding reachable switching sets P^π_j ∈ 𝒫^π. The relaxation procedure applied on the running example is shown in Fig. 3.

**Adaptation.** If both the membership query and the relaxation procedure fail, the procedure Adapt(H, f, ε, π) modifies the lha H so that it ε-captures f. Conceptually, we construct a new path π', based on some path π, and modify H accordingly such that the graph of H contains π'. Recalling Lemma 4, we need to ensure that every reachable switching set in 𝒫^{π'} is nonempty. We construct π' by trying to preserve the modes of the path π. If this is not possible, we try to replace them by existing modes in the lha H whenever possible, potentially adding new transitions. The last option is to create new modes. Finally, we extend the lha H by adding the new transitions and/or modes determined by the new path π'.

In more detail, given an lha H, an m-pwl function f and a path π = q_1, ..., q_m in H, we start with the path π' = π. Then, the adaptation procedure checks whether there is an empty reachable switching set in 𝒫^{π'}. Every time we detect emptiness of the set P^{π'}_j for some 0 ≤ j ≤ m, a mode in the path π' is replaced in order to make P^{π'}_j nonempty. We first try to replace the mode q_{j+1} if it exists. If P^{π'}_j is still empty or q_{j+1} does not exist, we repeat the replacement for q_j, q_{j−1}, and so on, until P^{π'}_j finally becomes nonempty.

For the replacement of the j-th mode q in the path π' we follow two strategies. The first strategy is to replace the mode q by an existing mode q' ≠ q in H such that *Flow*_H(q') is *similar* to slope(p_j). Formally, let T be the duration of piece p_j. *Flow*_H(q') is similar to slope(p_j) if ‖init(p_j) + T · *Flow*_H(q') − end(p_j)‖ ≤ 2ε. See Fig. 2(b) for an example. If the first strategy fails, the second strategy is to create a new mode q_* with flow newflow(q_*) = slope(p_j) for replacement in π'. We denote the set of existing modes similar to some mode q in π' by sim(π'), and the set of new modes q_* by new(π'). Once the path π' is constructed, the adaptation of the lha H is performed with respect to π'. Figure 4 exemplifies the adaptation of the lha in Fig. 1.
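The similarity test is a single norm comparison; a direct Python transcription (the function name is ours):

```python
import math

def similar(init_p, end_p, duration, flow, eps):
    """A mode's flow is *similar* to a piece's slope if following the
    flow from the piece's start point for the piece's duration T ends
    within 2*eps of the piece's end point:
        || init(p) + T * Flow(q') - end(p) || <= 2 * eps."""
    drift = [i + duration * a - e for i, a, e in zip(init_p, flow, end_p)]
    return math.hypot(*drift) <= 2 * eps
```

For a piece from 0 to 2 over two time units (slope 1) and ε = 0.25, a flow of 1.2 overshoots by 0.4 ≤ 2ε and is similar, while a flow of 1.5 is not.
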

**Fig. 3.** Example describing the procedure RelaxPath(H, f, ε, π) for H given in Fig. 1, f = f_2 (depicted on the left), and path π = q_1, q_0, q_1. The algorithm increases the invariant of mode q_1 by computing the convex hull of the old invariant [2, 2]_ε and the set [1, 1]_ε. Analogously, the guard of the transition (q_1, q_0) is increased.

**Definition 10.** *The* adaptation *of the* lha H = (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*) *with respect to an* m*-*pwl *function* f *with affine pieces* p_1, ..., p_m *and a path* π' = q'_1, ..., q'_m *is the* lha H' = (*Q*', *E*', *X*, *Flow*', *Inv*', *Grd*') *defined as:*

$$Q' := Q \cup \mathsf{new}(\pi'), \qquad E' := E \cup \{ (q'\_j, q'\_{j+1}) \mid 1 \leqslant j < m \},$$

$$\mathit{Flow}'(q) := \begin{cases} \mathsf{newflow}(q) & \text{if } q \in \mathsf{new}(\pi'), \\ \mathit{Flow}(q) & \text{otherwise}, \end{cases}$$

$$\mathit{Inv}'(q) := \begin{cases} \mathrm{chull}\big(\bigcup\_{q = q'\_j,\, q \neq q'\_1} P\_{j-1}^{\pi'} \cup \bigcup\_{q = q'\_j} P\_{j}^{\pi'}\big) & \text{if } q \in \mathsf{new}(\pi'), \\ \mathrm{chull}\big(\mathit{Inv}(q) \cup \bigcup\_{q = q'\_j,\, q \neq q'\_1} P\_{j-1}^{\pi'} \cup \bigcup\_{q = q'\_j} P\_{j}^{\pi'}\big) & \text{if } q \in \mathsf{sim}(\pi'), \\ \mathit{Inv}(q) & \text{otherwise}, \end{cases}$$

$$\mathit{Grd}'((q'\_j, q'\_{j+1})) := \begin{cases} P\_{j}^{\pi'} & \text{if } (q'\_j, q'\_{j+1}) \notin E, \\ \mathrm{chull}\big(\mathit{Grd}((q'\_j, q'\_{j+1})) \cup P\_{j}^{\pi'}\big) & \text{otherwise}. \end{cases}$$

If there is no path of length m in the graph of H, we choose a shorter path π' in H of length m' < m for the adaptation procedure. Then, for every position m' < j ≤ m, we define the reachable switching set P^{π'}_j as an empty set and proceed as usual.

#### **4.2 Discussion**

The construction of the initial lha (line 1 in Algorithm 1) can be modified to *cluster* pieces with *similar* slopes. This can help reduce the number of modes in the initial automaton, but it does not guarantee that the first pwl function f_0 is ε-captured. To fix this, f_0 can be included in the loop of Algorithm 1.

Algorithm 1 follows a *local* repair strategy, based on a single pwl function. Thanks to this, the algorithm can be used in an online setting where new data arrives after the algorithm has started. However, the resulting model is influenced

**Fig. 4.** Example describing the procedure Adapt(H, f, π, ε) for the lha H in Fig. 1 with respect to the 3-pwl function f = f_3, the path π = q_1, q_0, q_1, and ε = 0.25. The initial reachable switching set P^π_0 is the projection of the set P on state x. Considering the flows in q_1 and q_0, the next reachable switching set P^π_1 is the projection of the set Q on state x. Observe that from Q, using the flow of q_1, the reachable switching set P^π_2 is empty. We thus add a new mode q_* and obtain the new path π' = q_1, q_*, q_1.

by the order in which the algorithm processes the functions f ∈ F. In the simple case that F only contains affine functions with the same slope, all models resulting from different processing orders will consist of a single mode with the same flow, and the invariant bounds differ by at most ε. Furthermore, for a precision value ε = 0, the result is always order-independent.

We now discuss the restrictions of the models we obtain from Algorithm 1. We did not include a set of initial states in our presentation, but the generalization is straightforward. Our transitions do not include assignments, which would make executions discontinuous. The usual assumption in many application domains, e.g., life sciences, is that the underlying system is continuous, so having assignments would not be desirable. In the setting where the input is given as time-series data, discrete events would typically be approximated by steep slopes in the pwl function. In the setting where the input is given as discontinuous pwl functions f, in order to ε-capture f, one would generally require that the automaton switches synchronously with f (cf. Sect. 3.1), instead of asynchronous switching as in our algorithm. Under this additional assumption, we can pose the procedures Membership and RelaxPath as a single linear program (similar to formula φf,ε). This linear program can also be used to identify assignments.

The continuous dynamics of our models are defined by constant differential equations. As mentioned before, this class generally suffices to approximate an arbitrary continuous function (by increasing the number of modes). An extension of our approach to polyhedral differential *inclusions* (also called linear envelopes) can be obtained by merging modes with "similar" dynamics. This may, however, lead to the dilemma that several modes are equally similar.

#### **4.3 Theoretical Properties of the Membership-based Synthesis**

The following theorem asserts that Algorithm 1 solves Problem 3.

**Theorem 2 (Soundness and precision).** *Given a finite set of* pwl *functions* F *and a value* ε ∈ ℝ≥0*, let* H *be an automaton resulting from* Synthesis(F, ε)*. Then* H *both* ε*-captures all functions in* F *and is* ε*-precise with respect to* F*.*

Algorithm 1 satisfies a completeness property in the following sense: for every model H from a certain class, we can find a set F of pwl functions and a value ε such that Synthesis(F, ε) results in H. Before we can characterize the class of models, we first need to introduce some terminology.

**Definition 11.** *Let* q ∈ *Q be a mode with invariant* X = *Inv*(q) *and flow Flow*(q)*. We call a continuous state* **x**_2 ∈ X forward reachable in q *if there is a continuous state* **x**_1 ∈ X *such that* **x**_2 *is reachable from* **x**_1 *by just letting time pass, i.e.,* ∃t > 0 : **x**_2 = **x**_1 + *Flow*(q) · t*. Analogously, we call a state* **x**_2 ∈ X backward reachable in q *if there is a state* **x**_1 ∈ X *such that* **x**_1 *is reachable from* **x**_2*. A continuous state is* dead in q *if it is neither forward reachable nor backward reachable in* q*.*

We characterize the class of automata H = (*Q*, *E*, *X*, *Flow*, *Inv*, *Grd*) for which the algorithm is complete by the following assumptions: (1) no invariant contains a dead continuous state and, if e = (q_1, q_2) is a transition, then all continuous states in the guard *Grd*(e) are forward reachable in q_1 and backward reachable in q_2; (2) no two modes have the same slope.

Roughly speaking, Assumption (1) asserts that, after every switch, an execution can stay in the new mode for a positive amount of time.
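For interval invariants, the reachability notions of Definition 11 have an elementary characterization; a 1-dimensional Python sketch (our own simplification):

```python
def forward_reachable(x, inv, a):
    """x is forward reachable in a mode with invariant [lo, hi] and
    flow a if some x1 in [lo, hi] reaches x after letting time pass:
    there exists t > 0 with x = x1 + a * t."""
    lo, hi = inv
    if not (lo <= x <= hi):
        return False
    if a == 0:
        return True              # x1 = x works for any t > 0
    return x > lo if a > 0 else x < hi

def backward_reachable(x, inv, a):
    # Some x1 in the invariant is reachable from x; running time
    # forward from x is the same as running backward with flow -a.
    return forward_reachable(x, inv, -a)

def dead(x, inv, a):
    """Dead states per Definition 11: neither forward nor backward
    reachable.  With an interval invariant, this only happens for a
    point invariant combined with a nonzero flow."""
    return not forward_reachable(x, inv, a) and not backward_reachable(x, inv, a)
```
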

**Theorem 3 (Completeness).** *Given an* lha H *satisfying Assumptions (1) and (2), there exists a set of* pwl *functions* F *such that* Synthesis(F, 0) *results in* H*.*

#### **5 Experimental Results**

In this section, we present the experiments used to evaluate our algorithm. The algorithm was implemented in Python and relies on the standard scientific computation packages. For the computations involving polyhedra we used the pplpy wrapper to the Parma Polyhedra Library [4].

*Case Study: Online Synthesis.* We evaluate the precision of our algorithm by collecting data from the executions of existing linear hybrid automata. For each given automaton, we randomly sample ten executions and pass them to our algorithm, which then constructs a new model. After that, we run our algorithm with another 90 executions, but we reuse the intermediate model, thus demonstrating the online feature of the algorithm. We show the different models for two handcrafted examples in Table 1. We tried both sampling from random states and from a fixed state. The examples show the latter case, which makes sampling the complete state-space and thus learning a precise model harder.

The first example contains a sink with two incoming transitions, which requires at least two simulations to observe both transitions. Consequently, the algorithm had to make use of the *adaptation* step at least once to add one of the transitions. In the second example, some parts of the state-space are explored less frequently by the sampled executions. Hence the first model obtained after ten iterations does not yet represent all behavior of the original model. After the additional 90 iterations, the remaining parts of the state space have been visited, which is reflected in the precise bounds of the resulting model. In the table, we also show three sample executions from both the original and the final synthesized automaton to illustrate the similarity in the dynamical behavior.

**Table 1.** Synthesis results for two automaton models. The original model is shown in blue. The synthesis result after 10 iterations is shown in bright red, and after another 90 iterations in dark red. On the bottom left we show three sample executions starting from the same point (top: original model, bottom: synthesized model after 100 iterations). We used ε = 0.2 in all cases. Numbers are rounded to two places.

**Fig. 5.** Results for the cell model. Top: synthesized model using our algorithm. Bottom: three input traces (left) and random simulations of the synthesized model (right).

*Case Study: Cell Model.* For our case study we synthesize a hybrid automaton from voltage traces of excitable cells. Excitable cells are an important class of cells comprising neurons, cardiac cells, and other muscle cells. The main property of excitable cells is that they exhibit electrical activity which in the case of neurons enables signal transmission and in the case of muscle cells allows them to contract. The excitation signal usually follows distinct dynamics called action potential. Grosu et al. construct a *cyclic-linear hybrid automaton* from action-potential traces of cardiac cells [8]. In their model they identify six modes, two of which exhibit the same dynamics and are just used to model an input signal.

Our algorithm successfully synthesizes a model, depicted in Fig. 5, consisting of five modes that roughly match the normal phases of an action potential. We evaluate the quality of the synthesized model by simulating random executions and visually comparing to the original data (see the bottom of Fig. 5).

#### **6 Conclusion**

In this paper we have presented two fully automatic approaches to synthesize a linear hybrid automaton from data. As key features, the synthesized automaton captures the data up to a user-defined bound and is tight. Moreover, the online feature of the membership-based approach allows combining it with alternative synthesis techniques, e.g., for constructing initial models.

A future line of work is to design a methodology for identifying weak generalizations in the model, and to use them for driving the experiments and, in consequence, adjusting the model. We would first synthesize a model as before, but then identify the aspects of the model that are least substantiated by the data (e.g., areas in the state space or specific sequences in the executions). Then we would query the system for data about those aspects, and repair the model accordingly. As another line of work, we plan to extend the approach to go from dynamics defined by piecewise-constant differential equations toward linear envelopes. Our approach can be seen as a generalization, to LHA, of Angluin's algorithm for constructing a finite-state machine from finite traces [3], and we plan to pursue this connection further.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Overfitting in Synthesis: Theory and Practice

Saswat Padhi<sup>1(B)</sup>, Todd Millstein<sup>1</sup>, Aditya Nori<sup>2</sup>, and Rahul Sharma<sup>3</sup>

<sup>1</sup> University of California, Los Angeles, USA
{padhi,todd}@cs.ucla.edu
<sup>2</sup> Microsoft Research, Cambridge, UK
adityan@microsoft.com
<sup>3</sup> Microsoft Research, Bengaluru, India
rahsha@microsoft.com

Abstract. In syntax-guided synthesis (SyGuS), a synthesizer's goal is to automatically generate a program belonging to a grammar of possible implementations that meets a logical specification. We investigate a common limitation across state-of-the-art SyGuS tools that perform counterexample-guided inductive synthesis (CEGIS). We empirically observe that as the expressiveness of the provided grammar increases, the performance of these tools degrades significantly.

We claim that this degradation is not only due to a larger search space, but also due to *overfitting*. We formally define this phenomenon and prove *no-free-lunch* theorems for SyGuS, which reveal a fundamental tradeoff between synthesizer performance and grammar expressiveness.

A standard approach to mitigate overfitting in machine learning is to run multiple learners with varying expressiveness in parallel. We demonstrate that this insight can immediately benefit existing SyGuS tools. We also propose a novel single-threaded technique called *hybrid enumeration* that interleaves different grammars and outperforms the winner of the 2018 SyGuS competition (Inv track), solving more problems and achieving a 5× mean speedup.

#### 1 Introduction

The *syntax-guided synthesis* (SyGuS) framework [3] provides a unified format to describe a program synthesis problem by supplying (1) a logical specification for the desired functionality, and (2) a grammar of allowed implementations. Given these two inputs, a SyGuS tool searches through the programs that are permitted by the grammar to generate one that meets the specification. Today, SyGuS is at the core of several state-of-the-art program synthesizers [5,14,23,24,29], many of which compete annually in the SyGuS competition [1,4].

We demonstrate empirically that five state-of-the-art SyGuS tools are very sensitive to the choice of grammar. Increasing grammar expressiveness allows the tools to solve some problems that are unsolvable with less-expressive grammars. However, it also causes them to fail on many problems that the tools are able to solve with a less expressive grammar. We analyze the latter behavior both

S. Padhi—Contributed during an internship at Microsoft Research, India.

theoretically and empirically and present techniques that make existing tools much more robust in the face of increasing grammar expressiveness.

We restrict our investigation to a widely used approach [6] to SyGuS called *counterexample-guided inductive synthesis* (CEGIS) [37, §5]. In this approach, the synthesizer is composed of a learner and an oracle. The learner iteratively identifies a candidate program that is consistent with a given set of examples (initially empty) and queries the oracle to either prove that the program is *correct*, i.e., meets the given specification, or obtain a counterexample that demonstrates that the program does not meet the specification. The counterexample is added to the set of examples for the next iteration. The iterations continue until a correct program is found or resource/time budgets are exhausted.
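The CEGIS loop just described can be sketched in a few lines of Python. Everything below (the specification, the grammar, and the helper names) is purely illustrative; in a real SyGuS tool the oracle would be an SMT-based verifier rather than an exhaustive check over a finite domain.

```python
# A minimal CEGIS sketch (illustrative only): the learner proposes the first
# candidate consistent with the current examples; the oracle either accepts
# it or returns a counterexample input-output pair.
X = range(8)
spec = lambda e, x: e(x) == max(x, 3)          # desired: clamp below at 3

grammar = [                                     # candidates, smallest first
    ("x",         lambda x: x),
    ("3",         lambda x: 3),
    ("max(x, 3)", lambda x: max(x, 3)),
]

def oracle(e):
    for x in X:                                 # exhaustive check (finite X)
        if not spec(e, x):
            return (x, max(x, 3))               # counterexample <x, y>
    return None                                 # e is correct

def learner(examples):
    for name, e in grammar:
        if all(e(x) == y for x, y in examples):
            return name, e

examples, rounds = [], 0
while True:
    name, e = learner(examples)                 # candidate for this round
    cex = oracle(e)
    rounds += 1
    if cex is None:
        break
    examples.append(cex)                        # refine and iterate
print(name, rounds)
```

On this toy problem the loop takes three rounds: "x" is refuted at input 0, "3" is refuted at input 4, and "max(x, 3)" is verified.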

*Overfitting.* To better understand the observed performance degradation, we instrumented one of these SyGuS tools (Sect. 2.2). We empirically observe that for a large number of problems, the performance degradation on increasing grammar expressiveness is often accompanied by a significant increase in the number of counterexamples required. Intuitively, as grammar expressiveness increases so does the number of *spurious* candidate programs, which satisfy a given set of examples but violate the specification. If the learner picks such a candidate, then the oracle generates a counterexample, the learner searches again, and so on.

In other words, increasing grammar expressiveness increases the chances for *overfitting*, a well-known phenomenon in machine learning (ML). Overfitting occurs when a learned function explains a given set of observations but does not generalize correctly beyond it. Since SyGuS is indeed a form of function learning, it is perhaps not surprising that it is prone to overfitting. However, we identify its specific source in the context of SyGuS—the spurious candidates induced by increasing grammar expressiveness—and show that it is a significant problem in practice. We formally define the *potential for overfitting* (Ω), in Definition 7, which captures the number of spurious candidates.

*No Free Lunch.* In the ML community, this tradeoff between expressiveness and overfitting has been formalized for various settings as *no-free-lunch* (NFL) theorems [34, §5.1]. Intuitively such a theorem says that for every learner there exists a function that cannot be efficiently learned, where efficiency is defined by the number of examples required. We have proven corresponding NFL theorems for the CEGIS-based SyGuS setting (Theorems 1 and 2).

A key difference between the ML and SyGuS settings is the notion of m*-learnability*. In the ML setting, the learned function may differ from the true function, as long as this difference (expressed as an error probability) is relatively small. However, because the learner is allowed to make errors, it is in turn required to learn given an arbitrary set of m examples (drawn from some distribution). In contrast, the SyGuS learning setting is *all-or-nothing*—either the tool synthesizes a program that meets the given specification or it fails. Therefore, it would be overly strong to require the learner to handle an arbitrary set of examples.

Instead, we define a much weaker notion of m-learnability for SyGuS, which only requires that there *exist* a set of m examples for which the learner succeeds. Yet, our NFL theorem shows that even this weak notion of learnability can always be thwarted: given an integer m ≥ 0 and an expressive enough (as a function of m) grammar, for every learner there exists a SyGuS problem that cannot be learned without access to more than m examples. We also prove that overfitting is inevitable with an expressive enough grammar (Theorems 3 and 4) and that the potential for overfitting increases with grammar expressiveness (Theorem 5).

*Mitigating Overfitting.* Inspired by *ensemble methods* [13] in ML, which aggregate results from multiple learners to combat overfitting (and underfitting), we propose PLearn—a black-box framework that runs multiple parallel instances of a SyGuS tool with different grammars. Although prior SyGuS tools run multiple instances of learners with different random seeds [7,20], to our knowledge, this is the first proposal to explore multiple grammars as a means to improve the performance of SyGuS. Our experiments indicate that PLearn significantly improves the performance of five state-of-the-art SyGuS tools—CVC4 [7,33], EUSolver [5], LoopInvGen [29], SketchAC [20,37], and Stoch [3, III F].

However, running parallel instances of a synthesizer is computationally expensive. Hence, we also devise a white-box approach, called *hybrid enumeration*, that extends the enumerative synthesis technique [2] to efficiently interleave exploration of multiple grammars in a single SyGuS instance. We implement hybrid enumeration within LoopInvGen<sup>1</sup> and show that the resulting single-threaded learner, LoopInvGen+HE, has negligible overhead but achieves performance comparable to that of PLearn for LoopInvGen. Moreover, LoopInvGen+HE significantly outperforms the winner [28] of the invariant-synthesis (Inv) track of the 2018 SyGuS competition [4]—a variant of LoopInvGen specifically tuned for the competition—achieving a 5× mean speedup and solving two SyGuS problems that no tool in the competition could solve.

*Contributions.* In summary, we present the following contributions:


<sup>1</sup> Our implementation is available at https://github.com/SaswatPadhi/LoopInvGen.

#### 2 Motivation

In this section, we first present empirical evidence that existing SyGuS tools are sensitive to changes in grammar expressiveness. Specifically, we demonstrate that as we increase the expressiveness of the provided grammar, every tool starts failing on some benchmarks that it was able to solve with less-expressive grammars. We then investigate one of these tools in detail.

#### 2.1 Grammar Sensitivity of SyGuS Tools

We evaluated 5 state-of-the-art SyGuS tools that use very different techniques:


We ran these five tools on 180 invariant-synthesis benchmarks, which we describe in Sect. 5. We ran the benchmarks with each of the six grammars of quantifier-free predicates, which are shown in Fig. 1. These grammars correspond to widely used abstract domains in the analysis of integer-manipulating programs— Equalities, Intervals [11], Octagons [25], Polyhedra [12], algebraic expressions (Polynomials) and arbitrary integer arithmetic (Peano) [30]. The \*<sup>S</sup> operator denotes scalar multiplication, e.g., (\*<sup>S</sup> 2 x), and \*<sup>N</sup> denotes nonlinear multiplication, e.g., (\*<sup>N</sup> x y).

In Fig. 2, we report our findings from running each benchmark on each tool with each grammar, under a 30-minute wall-clock timeout. For each ⟨tool, grammar⟩ pair, the y-axis shows the number of failing benchmarks that the same tool is able to solve with a less-expressive grammar. We observe that, for each tool, the number of such failures increases with grammar expressiveness. For instance, introducing the scalar multiplication operator (\*<sup>S</sup>) causes CVC4 to fail on 21 benchmarks that it is able to solve with Equalities (4/21), Intervals (18/21), or Octagons (10/21). Similarly, adding nonlinear multiplication causes LoopInvGen to fail on 10 benchmarks that it can solve with a less-expressive grammar.

Fig. 1. Grammars of quantifier-free predicates over integers. (We use the ::=<sup>+</sup> operator to append new rules to previously defined nonterminals.)

Fig. 2. For each ⟨tool, grammar⟩ pair, the ordinate shows the number of benchmarks that *fail* with that grammar but are solvable with a less-expressive grammar.


Fig. 3. Observed correlation between synthesis time and number of rounds, upon increasing grammar expressiveness, with LoopInvGen [29] on 180 benchmarks

#### 2.2 Evidence for Overfitting

To better understand this phenomenon, we instrumented LoopInvGen [29] to record the candidate expressions that it synthesizes and the number of CEGIS iterations (called *rounds* henceforth). We compare each pair of successful runs of each of our 180 benchmarks on distinct grammars.<sup>2</sup> In 65 % of such pairs, we observe performance degradation with the more expressive grammar. We also report the correlation between performance degradation and number of rounds for the more expressive grammar in each pair in Fig. 3.

In 67 % of the cases with degraded performance upon increased grammar expressiveness, the number of rounds remains unaffected—indicating that this slowdown is mainly due to a larger search space. However, there is significant evidence of performance degradation due to *overfitting* as well. We note an increase in the number of rounds for 27 % of the cases with degraded performance. Moreover, we notice performance degradation in 79 % of all cases that required more rounds on increasing grammar expressiveness.

Thus, a more expressive grammar not only increases the search space, but also makes it more likely for LoopInvGen to overfit—select a spurious expression, which the oracle rejects with a counterexample, hence requiring more rounds. In the remainder of this section, we demonstrate this overfitting phenomenon on the verification problem shown in Fig. 4, an example by Gulwani and Jojic [17], which is the fib\_19 benchmark in the Inv track of SyGuS-Comp 2018 [4].

<sup>2</sup> We ignore failing runs since they require an unknown number of rounds.

For Fig. 4, we require an inductive invariant that is strong enough to prove that the assertion on line 6 always holds. In the SyGuS setting, we need to synthesize a predicate I : Z<sup>4</sup> → B, defined on a symbolic state σ = ⟨m, n, x, y⟩, that satisfies ∀σ : ϕ(I, σ) for the specification ϕ:<sup>3</sup>

Fig. 4. The fib\_19 benchmark [17]

$$\begin{array}{rcl} \varphi(\mathcal{I},\sigma) \stackrel{\mathsf{def}}{=} \left(0 \le n \land 0 \le m \le n \land x = 0 \land y = m\right) & \implies \mathcal{I}(\sigma) & \text{(precondition)}\\ \land\ \forall \sigma' \colon \left(\mathcal{I}(\sigma) \land T(\sigma, \sigma')\right) & \implies \mathcal{I}(\sigma') & \text{(inductiveness)}\\ \land\ \left(x \ge n \land \mathcal{I}(\sigma)\right) & \implies y = n & \text{(postcondition)} \end{array}$$

where σ′ = ⟨m′, n′, x′, y′⟩ denotes the new state after one iteration, and T is a transition relation that describes the loop body:

$$\begin{array}{rcl} T(\sigma, \sigma') \stackrel{\mathsf{def}}{=} (x < n) \land (x' = x + 1) \land (m' = m) \land (n' = n) \\ \land \left[ (x' \le m \land y' = y) \lor (x' > m \land y' = y + 1) \right] \end{array}$$
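As a sanity check on the semantics of T, the loop of Fig. 4 can be replayed concretely. The sketch below is hypothetical (variable names follow the specification above); it confirms the postcondition on sampled runs, which of course is no substitute for the inductive invariant I.

```python
# Replay the loop described by the transition relation T above and check
# that the postcondition y = n holds on concrete runs (a sanity check on
# the encoding, not a proof).
def run(m, n):
    assert 0 <= m <= n                 # precondition
    x, y = 0, m
    while x < n:                       # one application of T per iteration
        x += 1                         # x' = x + 1; m' = m; n' = n
        if x > m:                      # (x' <= m and y' = y) or
            y += 1                     # (x' > m and y' = y + 1)
    return y

for m in range(5):
    for n in range(m, 8):
        assert run(m, n) == n          # postcondition on these runs
print("postcondition y = n holds on all sampled inputs")
```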



Fig. 5. Performance of LoopInvGen [29] on the fib\_19 benchmark (Fig. 4). In (b) and (c), we show predicates generated at various rounds (numbered in bold).

In Fig. 5(a), we report the performance of LoopInvGen on fib\_19 (Fig. 4) with our six grammars (Fig. 1). It succeeds with all but the least-expressive grammar. However, as grammar expressiveness increases, the number of rounds increases significantly—from 19 rounds with Intervals to 88 rounds with Peano.

LoopInvGen converges to the *exact same* invariant with both Polyhedra and Peano but requires 30 more rounds in the latter case. In Figs. 5(b) and (c), we list some expressions synthesized with Polyhedra and Peano respectively. These expressions are solutions to intermediate subproblems—the final loop invariant is a conjunction of a subset of these expressions [29, §3.2]. Observe that the expressions generated with the Peano grammar are quite complex and unlikely to generalize well. Peano's extra expressiveness leads to more spurious candidates, increasing the chances of overfitting and making the benchmark harder to solve.

<sup>3</sup> We use B, N, and Z to denote the sets of all Boolean values, all natural numbers (positive integers), and all integers respectively.

#### 3 SyGuS Overfitting in Theory

In this section, first we formalize the *counterexample-guided inductive synthesis* (CEGIS) approach [37] to SyGuS, in which examples are iteratively provided by a verification oracle. We then state and prove *no-free-lunch* theorems, which show that there can be no optimal learner for this learning scheme. Finally, we formalize a natural notion of *overfitting* for SyGuS and prove that the potential for overfitting increases with grammar expressiveness.

#### 3.1 Preliminaries

We borrow the formal definition of a SyGuS problem from prior work [3]:

Definition 1 (SyGuS Problem). *Given a background theory* T*, a function symbol* f : X → Y *, and constraints on* f*: (1) a semantic constraint, also called a* specification*,* φ(f,x) *over the vocabulary of* T *along with* f *and a symbolic input* x*, and (2) a syntactic constraint, also called a* grammar*, given by a (possibly infinite) set* <sup>E</sup> *of expressions over the vocabulary of the theory* <sup>T</sup>*; find an expression* <sup>e</sup> <sup>∈</sup> <sup>E</sup> *such that the formula* <sup>∀</sup><sup>x</sup> <sup>∈</sup> <sup>X</sup> : <sup>φ</sup>(e, x) *is valid modulo* <sup>T</sup>*.*

*We denote this SyGuS problem as* ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub> *and say that it is* satisfiable *iff there exists such an expression* e*, i.e.,* ∃ e ∈ E : ∀x ∈ X : φ(e, x)*. We call* e *a* satisfying expression *for this problem, denoted as* e ⊨ ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub>*.*
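For intuition, Definition 1 can be instantiated on a tiny finite problem. All names below are hypothetical, and the grammar is given as an explicit set of expressions rather than by production rules.

```python
# A toy instance of a SyGuS problem (illustrative only): X = {0,...,4},
# the specification phi(e, x) requires e(x) = 2x, and the grammar E is a
# small explicit set of candidate expressions.
X = range(5)
E = {
    "x":     lambda x: x,
    "x + x": lambda x: x + x,
    "x * x": lambda x: x * x,
}

def phi(e, x):                  # semantic constraint
    return e(x) == 2 * x

# e satisfies the problem iff phi(e, x) holds for all x in X
satisfying = [name for name, e in E.items() if all(phi(e, x) for x in X)]
print(satisfying)
```

Only "x + x" is a satisfying expression here; "x * x" agrees with the specification on the inputs 0 and 2 but not on all of X.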

Recall that we focus on a common class of SyGuS learners, namely those that learn from examples. First, we define the notion of input-output (IO) examples that are consistent with a SyGuS specification:

Definition 2 (Input-Output Example). *Given a specification* φ *defined on* f : X → Y *over a background theory* T*, we call a pair* ⟨x, y⟩ ∈ X × Y *an input-output (IO) example for* φ*, denoted as* ⟨x, y⟩ ⊨<sub>T</sub> φ*, iff it is satisfied by some valid interpretation of* f *within* T*, i.e.,*

$$\langle x, y \rangle \models_{\mathsf{T}} \varphi \;\stackrel{\mathsf{def}}{=}\; \exists\, e^{*} \in \mathsf{T} \colon e^{*}(x) = y \,\land\, \left(\forall x \in X \colon \varphi(e^{*}, x)\right)$$

The next two definitions respectively formalize the two key components of a CEGIS-based SyGuS tool: the verification oracle and the learner.

Definition 3 (Verification Oracle). *Given a specification* φ *defined on a function* f : X → Y *over theory* T*, a verification oracle* O<sub>φ</sub> *is a partial function that, given an expression* e*, either returns* ⊥*, indicating that* ∀x ∈ X : φ(e, x) *holds, or gives a counterexample* ⟨x, y⟩ *against* e*, denoted as* e ↝<sub>φ</sub> ⟨x, y⟩*, such that*

$$e \leadsto_{\varphi} \langle x, y \rangle \stackrel{\mathsf{def}}{=} \neg \varphi(e, x) \,\land\, e(x) \neq y \,\land\, \langle x, y \rangle \models_{\mathsf{T}} \varphi$$

We omit φ from the notations O<sub>φ</sub> and ↝<sub>φ</sub> when it is clear from the context.

Definition 4 (CEGIS-based Learner). *A CEGIS-based learner* L<sup>O</sup>(q, E) *is a partial function that, given an integer* q ≥ 0*, a set* E *of expressions, and access to an oracle* O *for a specification* φ *defined on* f : X → Y*, queries* O *at most* q *times and either fails with* ⊥ *or generates an expression* e ∈ E*. The* trace

$$e_0 \leadsto \langle x_0, y_0 \rangle,\; \ldots,\; e_{p-1} \leadsto \langle x_{p-1}, y_{p-1} \rangle,\; e_p \qquad \text{where } 0 \le p \le q$$

*summarizes the interaction between the oracle and the learner. Each* e<sub>i</sub> *denotes the* i*th candidate for* f*, and* ⟨x<sub>i</sub>, y<sub>i</sub>⟩ *is a counterexample for* e<sub>i</sub>*, i.e.,*

$$\left(\forall j < i \colon e\_i(x\_j) = y\_j \land \phi(e\_i, x\_j)\right) \land \left(e\_i \leadsto\_{\phi} \langle x\_i, y\_i \rangle\right)$$

Note that we have defined oracles and learners as (partial) functions, and hence as *deterministic*. In practice, many SyGuS tools are deterministic and this assumption simplifies the subsequent theorems. However, we expect that these theorems can be appropriately generalized to randomized oracles and learners.

#### 3.2 Learnability and No Free Lunch

In the machine learning (ML) community, the limits of learning have been formalized for various settings as *no-free-lunch* theorems [34, §5.1]. Here, we provide a natural form of such theorems for CEGIS-based SyGuS learning.

In SyGuS, the learned function must conform to the given grammar, which may not be fully expressive. Therefore we first formalize grammar expressiveness:

Definition 5 (*k*-Expressiveness). *Given a domain* X *and range* Y *, a grammar* E *is said to be* k*-expressive iff* E *can express exactly* k *distinct* X → Y *functions.*

A key difference from the ML setting is our notion of m*-learnability*, which formalizes the number of examples that a learner requires in order to learn a desired function. In the ML setting, a function is considered to be m-learnable by a learner if it can be learned using an *arbitrary* set of m i.i.d. examples (drawn from some distribution). This makes sense in the ML setting, since the learned function is allowed to make errors (up to some given bound on the error probability), but it is much too strong for the *all-or-nothing* SyGuS setting.

Instead, we define a much weaker notion of m-learnability for CEGIS-based SyGuS, which only requires that there *exist* a set of m examples that allows the learner to succeed. The following definition formalizes this notion.

Definition 6 (CEGIS-based *m*-Learnability). *Given a SyGuS problem* S = ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub> *and an integer* m ≥ 0*, we say that* S *is* m*-learnable by a CEGIS-based learner* L *iff there exists a verification oracle* O *under which* L *can learn a satisfying expression for* S *with at most* m *queries to* O*, i.e.,* ∃ O : L<sup>O</sup>(m, E) ⊨ S*.*

Finally we state and prove the no-free-lunch (NFL) theorems, which make explicit the tradeoff between grammar expressiveness and learnability. Intuitively, given an integer m and an expressive enough (as a function of m) grammar, for every learner there exists a SyGuS problem that cannot be solved without access to at least m + 1 examples. This is true despite our weak notion of learnability.

Put another way, as grammar expressiveness increases, so does the number of examples required for learning. On one extreme, if the given grammar is 1-expressive, i.e., can express exactly one function, then all satisfiable SyGuS problems are 0-learnable—no examples are needed because there is only one function to learn—but there are *many* SyGuS problems that cannot be satisfied by this function. On the other extreme, if the grammar is |Y|<sup>|X|</sup>-expressive, i.e., can express all functions from X to Y, then for every learner there exists a SyGuS problem that requires *all* |X| *examples* in order to be solved.

Below we first present the NFL theorem for the case when the domain X and range Y are finite. We then generalize to the case when these sets may be countably infinite. We provide the proofs of these theorems in the extended version of this paper [27, Appendix A.1].

Theorem 1 (NFL in CEGIS-based SyGuS on Finite Sets). *Let* X *and* Y *be two arbitrary finite sets,* <sup>T</sup> *be a theory that supports equality,* <sup>E</sup> *be a grammar over* <sup>T</sup>*, and* <sup>m</sup> *be an integer such that* <sup>0</sup> <sup>≤</sup> m < <sup>|</sup>X|*. Then, either:*


Theorem 2 (NFL in CEGIS-based SyGuS on Countably Infinite Sets). *Let* X *be an arbitrary countably infinite set,* Y *be an arbitrary finite or countably infinite set,* <sup>T</sup> *be a theory that supports equality,* <sup>E</sup> *be a grammar over* <sup>T</sup>*, and* <sup>m</sup> *be an integer such that* m ≥ 0*. Then, either:*


#### 3.3 Overfitting

Last, we relate the above theory to the notion of *overfitting* from ML. In the context of SyGuS, overfitting can potentially occur whenever there are multiple candidate expressions that are consistent with a given set of examples. Some of these expressions may not generalize to satisfy the specification, but the learner has no way to distinguish among them (using just the given set of examples) and so can "guess" incorrectly. We formalize this idea through the following measure:

Definition 7 (Potential for Overfitting). *Given a problem* S = ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub> *and a set* Z *of IO examples for* φ*, we define the potential for overfitting* Ω *as the number of expressions in* E *that are consistent with* Z *but do not satisfy* S*, i.e.,*

$$\Omega(\mathcal{S}, Z) \stackrel{\mathsf{def}}{=} \begin{cases} \left|\{\, e \in \mathcal{E} \;\mid\; e \not\models \mathcal{S} \,\land\, \forall \langle x, y \rangle \in Z \colon e(x) = y \,\}\right| & \text{if } \forall z \in Z \colon z \models_{\mathsf{T}} \varphi \\ \bot \ \text{(undefined)} & \text{otherwise} \end{cases}$$

Intuitively, a zero potential for overfitting means that overfitting is not possible on the given problem with respect to the given set of examples, because there is no spurious candidate. A positive potential for overfitting means that overfitting is possible, and higher values imply more spurious candidates and hence more potential for a learner to choose the "wrong" expression.
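When the grammar is small enough to enumerate, Ω can be computed directly. The sketch below (with illustrative names, not taken from the paper) counts spurious candidates for a toy problem and shows the potential for overfitting shrinking as examples are added.

```python
# Computing Omega from Definition 7 for a tiny enumerable grammar
# (illustrative only): count expressions consistent with the examples Z
# that nevertheless violate the specification.
X = range(4)
spec = lambda e: all(e(x) == x * x for x in X)     # phi: e computes x^2

E = [lambda x: x * x,       # the intended function
     lambda x: x,           # spurious: agrees with x^2 only at 0 and 1
     lambda x: 0,           # spurious: agrees only at 0
     lambda x: 3 * x - 2]   # spurious: agrees at 1 and 2

def omega(Z):
    return sum(1 for e in E
               if not spec(e) and all(e(x) == y for x, y in Z))

print(omega([(1, 1)]))                          # two spurious candidates fit
print(omega([(1, 1), (2, 4)]))                  # only 3x - 2 still fits
print(omega([(0, 0), (1, 1), (2, 4), (3, 9)]))  # no spurious candidate left
```

With a single example ⟨1, 1⟩ two spurious expressions remain consistent, so a learner can "guess" incorrectly; with all four examples Ω drops to zero and overfitting becomes impossible.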

The following theorems connect our notion of overfitting to the earlier NFL theorems by showing that overfitting is inevitable with an expressive enough grammar. The proofs of these theorems can be found in the extended version of this paper [27, Appendix A.2].

Theorem 3 (Overfitting in SyGuS on Finite Sets). *Let* X *and* Y *be two arbitrary finite sets,* m *be an integer such that* 0 ≤ m < |X|*,* T *be a theory that supports equality, and* E *be a* k*-expressive grammar over* T *for some* $k > \frac{|X|!\,|Y|^{m}}{m!\,(|X|-m)!}$*. Then, there exists a satisfiable SyGuS problem* S = ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub> *such that* Ω(S, Z) > 0*, for every set* Z *of* m *IO examples for* φ*.*

Theorem 4 (Overfitting in SyGuS on Countably Infinite Sets). *Let* X *be an arbitrary countably infinite set,* Y *be an arbitrary finite or countably infinite set,* T *be a theory that supports equality, and* E *be a* k*-expressive grammar over* T *for some* k > ℵ<sub>0</sub>*. Then, there exists a satisfiable SyGuS problem* S = ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub> *such that* Ω(S, Z) > 0*, for every set* Z *of* m *IO examples for* φ*.*

Finally, it is straightforward to show that as the expressiveness of the grammar provided in a SyGuS problem increases, so does its potential for overfitting.

Theorem 5 (Overfitting Increases with Expressiveness). *Let* X *and* Y *be two arbitrary sets,* T *be an arbitrary theory,* E<sub>1</sub> *and* E<sub>2</sub> *be grammars over* T *such that* E<sub>1</sub> ⊆ E<sub>2</sub>*,* φ *be an arbitrary specification over* T *and a function symbol* f : X → Y*, and* Z *be a set of IO examples for* φ*. Then, we have*

$$\Omega\left(\langle f_{X \to Y} \mid \varphi, \mathcal{E}_1 \rangle_{\mathsf{T}},\, Z\right) \;\le\; \Omega\left(\langle f_{X \to Y} \mid \varphi, \mathcal{E}_2 \rangle_{\mathsf{T}},\, Z\right)$$

#### 4 Mitigating Overfitting

*Ensemble methods* [13] in machine learning (ML) are a standard approach to reduce overfitting. These methods aggregate predictions from several learners to make a more accurate prediction. In this section we propose two approaches, inspired by ensemble methods in ML, for mitigating overfitting in SyGuS. Both are based on the key insight from Sect. 3.3 that synthesis over a subgrammar has a smaller potential for overfitting as compared to that over the original grammar.

#### 4.1 Parallel SyGuS on Multiple Grammars

Our first idea is to run multiple parallel instances of a synthesizer on the same SyGuS problem but with grammars of varying expressiveness. This framework, called PLearn, is outlined in Algorithm 1. It accepts a synthesis tool T, a SyGuS


**Algorithm 1.** The PLearn framework

1 **func** PLearn(T : Synthesis Tool, ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub> : Problem, E<sub>1...p</sub> : Subgrammars)
2   **Requires:** ∀ E<sub>i</sub> ∈ E<sub>1...p</sub> : E<sub>i</sub> ⊆ E
3   **parallel for** i ← 1, ..., p **do**
4     S<sub>i</sub> ← ⟨f<sub>X→Y</sub> | φ, E<sub>i</sub>⟩<sub>T</sub>
5     e<sub>i</sub> ← T(S<sub>i</sub>)
6     **if** e<sub>i</sub> ≠ ⊥ **then return** e<sub>i</sub>
7   **return** ⊥

problem ⟨f<sub>X→Y</sub> | φ, E⟩<sub>T</sub>, and subgrammars E<sub>1...p</sub>,<sup>4</sup> such that E<sub>i</sub> ⊆ E. The parallel for construct creates a new thread for each iteration. The loop in PLearn creates p copies of the SyGuS problem, each with a different grammar from E<sub>1...p</sub>, and dispatches each copy to a new instance of the tool T. PLearn returns the first solution found, or ⊥ if none of the synthesizer instances succeeds.

Since each grammar in E1...p is subsumed by the original grammar E, any expression found by PLearn is a solution to the original SyGuS problem. Moreover, from Theorem 5 it is immediate that PLearn indeed reduces overfitting.
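The dispatch loop of PLearn can be sketched in Python (a minimal single-process stand-in: `synthesize` is a hypothetical solver interface playing the role of the tool T, and a thread pool plays the role of the parallel-for):

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def plearn(synthesize, spec, subgrammars):
    """Run one synthesizer instance per subgrammar; return the first solution.

    `synthesize(spec, grammar)` stands in for the tool T of Algorithm 1:
    it returns a satisfying expression, or None in place of the failure value.
    """
    with ThreadPoolExecutor(max_workers=len(subgrammars)) as pool:
        pending = {pool.submit(synthesize, spec, g) for g in subgrammars}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                result = fut.result()
                if result is not None:
                    return result          # first solution wins
    return None

# Toy instantiation: a "grammar" is just a pool of candidate functions,
# and the specification is a check against IO examples.
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1)]
spec = lambda f: all(f(*xy) == out for xy, out in examples)
g1 = [lambda x, y: x, lambda x, y: y]          # less expressive subgrammar
g2 = g1 + [lambda x, y: max(x, y)]             # more expressive subgrammar
toy_solver = lambda s, g: next((f for f in g if s(f)), None)
solution = plearn(toy_solver, spec, [g1, g2])
```

Because every subgrammar is contained in the original grammar, whatever any instance returns is also a solution over the original grammar, as the theorem below formalizes.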

Theorem 6 (PLearn Reduces Overfitting). *Given a SyGuS problem S = ⟨f : X→Y | φ, E⟩_T, if PLearn is instantiated with S and subgrammars E_{1...p} such that ∀ E_i ∈ E_{1...p} : E_i ⊆ E, then for each S_i = ⟨f : X→Y | φ, E_i⟩_T constructed by PLearn, we have that Ω(S_i, Z) ≤ Ω(S, Z) on any set Z of IO examples for φ.*

A key advantage of PLearn is that it is agnostic to the synthesizer's implementation. Therefore, existing SyGuS learners can immediately benefit from it, as we demonstrate in Sect. 5.1. However, running p parallel SyGuS instances can be prohibitively expensive, in both computation and memory. The problem is worsened by the fact that many existing SyGuS tools already use multiple threads; e.g., the SketchAC [20] tool spawns 9 threads. This motivates our *hybrid enumeration* technique, described next: a novel synthesis algorithm that interleaves the exploration of multiple grammars in a single thread.

#### 4.2 Hybrid Enumeration

Hybrid enumeration extends the *enumerative synthesis* technique, which enumerates expressions within a given grammar in order of size and returns the first candidate that satisfies the given examples [2]. Our goal is to simulate the behavior of PLearn with an enumerative synthesizer in a single thread. However, a straightforward interleaving of multiple PLearn threads would be highly inefficient because of redundancies – enumerating the same expression (which is contained in multiple grammars) multiple times. Instead, we propose a technique that (1) enumerates each expression at most once, and (2) reuses previously enumerated expressions to construct larger expressions.

<sup>4</sup> We use the shorthand X_{1...n} to denote the sequence ⟨X_1, ..., X_n⟩.

To achieve this, we extend a widely used [2,15,31] synthesis strategy, called *component-based synthesis* [21], wherein the grammar of expressions is induced by a set of components, each of which is a typed operator with a fixed arity. For example, the grammars shown in Fig. 1 are induced by integer components (such as 1, +, mod, =, etc.) and Boolean components (such as true, and, or, etc.). Below, we first formalize the grammar that is implicit in this synthesis style.

Definition 8 (Component-Based Grammar). *Given a set C of typed components, we define the* component-based *grammar* E *as the set of all expressions formed by well-typed component application over C , i.e.,*

$$\mathcal{E} = \{\, c(e\_1, \dots, e\_a) \mid (c : \tau\_1 \times \dots \times \tau\_a \to \tau) \in C \,\wedge\, e\_{1 \dots a} \subset \mathcal{E} \,\wedge\, e\_1 : \tau\_1 \wedge \dots \wedge e\_a : \tau\_a \,\}$$

*where* e : τ *denotes that the expression* e *has type* τ *.*

We denote the set of all components appearing in a component-based grammar E as components(E). Henceforth, we assume that components(E) is known (explicitly provided by the user) for each E. We also use values(E) to denote the subset of nullary components (variables and constants) in components(E), and operators(E) to denote the remaining components with positive arities.
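One concrete way to carry this bookkeeping (an illustrative sketch, not the paper's implementation; all names here are ours) is to record each component's argument and result types and derive values and operators by arity:

```python
from typing import NamedTuple, Tuple

class Component(NamedTuple):
    name: str
    arg_types: Tuple[str, ...]   # () for nullary components
    ret_type: str

def values(components):
    """Nullary components of a grammar: its variables and constants."""
    return [c for c in components if len(c.arg_types) == 0]

def operators(components):
    """Components with positive arity."""
    return [c for c in components if len(c.arg_types) > 0]

# Components inducing a small integer/Boolean grammar (cf. the grammars of Fig. 1).
C = [
    Component("x", (), "int"),
    Component("1", (), "int"),
    Component("+", ("int", "int"), "int"),
    Component("=", ("int", "int"), "bool"),
    Component("and", ("bool", "bool"), "bool"),
]
```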

The closure property of component-based grammars significantly reduces the overhead of tracking which subexpressions can be combined to form larger expressions. Given a SyGuS problem over a grammar E, hybrid enumeration requires a sequence E_{1...p} of grammars such that each E_i is a component-based grammar and E_1 ⊂ ··· ⊂ E_p ⊆ E. Next, we explain how the subset relationship between the grammars enables efficient enumeration of expressions.

Given grammars E_1 ⊂ ··· ⊂ E_p, observe that an expression of size k in E_i may only contain subexpressions of sizes {1, ..., (k − 1)} belonging to E_{1...i}. This allows us to enumerate expressions in an order such that each subexpression e is synthesized (and cached) before any expression that has e as a subexpression. We call an enumeration order that ensures this property a *well order*.
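The size-indexed caching that such an order enables can be sketched as follows (illustrative only: a single grammar with four leaves and a Plus operator, where the size of an expression counts its nodes):

```python
# Size-indexed cache: every subexpression is enumerated (and stored) before
# any expression that contains it. A Plus node itself contributes size 1.
cache = {1: ["x", "y", "0", "1"]}          # the size-1 expressions (leaves)

for k in range(2, 6):                      # build sizes 2..5 bottom-up
    cache[k] = []
    for i in range(1, k - 1):              # split size k-1 among the two arguments
        for a in cache[i]:
            for b in cache[k - 1 - i]:
                cache[k].append(f"Plus({a}, {b})")
```

Every size-k entry is assembled purely from already-cached smaller entries, so no expression is ever rebuilt from scratch.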

Definition 9 (Well Order). *Given arbitrary grammars E_{1...p}, we say that a strict partial order ◁ on E_{1...p} × ℕ is a well order iff*

$$\forall \mathcal{E}\_a, \mathcal{E}\_b \in \mathcal{E}\_{1\dots p}: \ \forall k\_1, k\_2 \in \mathbb{N}: \ [\mathcal{E}\_a \subseteq \mathcal{E}\_b \land k\_1 < k\_2] \implies (\mathcal{E}\_a, k\_1) \lhd (\mathcal{E}\_b, k\_2)$$

Motivated by Theorem 5, our implementation of hybrid enumeration uses a particular well order that incrementally increases the expressiveness of the space of expressions. For a rough measure of the expressiveness (Definition 5) of a pair (E, k), i.e., the set of expressions of size k in a given grammar E, we simply overapproximate the number of syntactically distinct expressions:

Theorem 7. *Let E_{1...p} be component-based grammars and C_i = components(E_i). Then, the following strict partial order ≺_* on E_{1...p} × ℕ is a well order:*

$$\forall \mathcal{E}\_a, \mathcal{E}\_b \in \mathcal{E}\_{1\dots p}: \ \forall m, n \in \mathbb{N}: \ (\mathcal{E}\_a, m) \prec\_\* (\mathcal{E}\_b, n) \iff |\mathcal{C}\_a|^m < |\mathcal{C}\_b|^n$$
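Under the stated assumptions (|C_1| < ··· < |C_p|), this order can be realized simply by sorting (grammar, size) pairs on the key |C_i|^k; the component counts below are illustrative:

```python
# Order (grammar index, expression size) pairs by the over-approximate
# expressiveness measure |C_i|**k, so less expressive spaces come first.
component_counts = [4, 7, 10]     # |C_1| < |C_2| < |C_3|, purely illustrative
max_size = 4

pairs = [(i, k) for i in range(len(component_counts))
                for k in range(1, max_size + 1)]
pairs.sort(key=lambda p: component_counts[p[0]] ** p[1])
```

Note how the resulting order interleaves the spaces: size-3 expressions of the smallest grammar (4³ = 64) come before size-2 expressions of the largest (10² = 100), so less expressive candidates are always tried first.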

We now describe the main hybrid enumeration algorithm, listed in Algorithm 2. The HEnum function accepts a SyGuS problem ⟨f : X→Y | φ, E⟩_T, a set E_{1...p} of component-based grammars such that E_1 ⊂ ··· ⊂ E_p ⊆ E, a well order ≺, and an upper bound q ≥ 0 on the size of expressions to enumerate. In lines 4–8, we first enumerate all values and cache them as expressions of size one. In general, C[j, k][τ] contains expressions of type τ and size k from E_j \ E_{j−1}. In line 9, we sort the (grammar, size) pairs in some total order consistent with ≺. Finally, in lines 10–20, we iterate over each pair (E_j, k) and each operator from E_{1...j} and invoke the Divide procedure (Algorithm 3) to carefully choose the operator's argument subexpressions, ensuring (1) *correctness* – their sizes sum up to k − 1, (2) *efficiency* – expressions are enumerated at most once, and (3) *completeness* – all expressions of size k in E_j are enumerated.

The Divide algorithm generates a set of locations for selecting arguments to an operator. Each location is a pair (x, y) indicating that any expression from C[x, y][τ] can be an argument, where τ is the argument type required by the operator. Divide accepts an arity a for an operator o, a size budget q, the index l of the least-expressive grammar containing o, the index j of the least-expressive grammar that should contain the constructed expressions of the form o(e_1, ..., e_a), and an accumulator α that stores the list of argument locations. In lines 7–9, the size budget is recursively divided among a − 1 locations. In each recursive step, the upper bound (q − a + 1) on v ensures that we retain a size budget of at least q − (q − a + 1) = a − 1 for the remaining a − 1 locations. This results in a call tree such that the accumulator α at each leaf node contains the locations from which to select the last a − 1 arguments, and we are left with some size budget q ≥ 1 for the first argument e_1. Finally, in lines 4–5, we carefully select the locations for e_1 to ensure that o(e_1, ..., e_a) has not been synthesized before: either o ∈ components(E_j) \ components(E_{j−1}), or at least one argument belongs to E_j \ E_{j−1}.<sup>5</sup>

We conclude by stating some desirable properties satisfied by HEnum. Their proofs are provided in the extended version of this paper [27, Appendix A.3].

Theorem 8 (HEnum is Complete up to Size *q*). *Given a SyGuS problem S = ⟨f : X→Y | φ, E⟩_T, let E_{1...p} be component-based grammars over theory T such that E_1 ⊂ ··· ⊂ E_p = E, ≺ be a well order on E_{1...p} × ℕ, and q ≥ 0 be an upper bound on the size of expressions. Then, HEnum(S, E_{1...p}, ≺, q) will eventually find a satisfying expression if one exists with size ≤ q.*

Theorem 9 (HEnum is Efficient). *Given a SyGuS problem S = ⟨f : X→Y | φ, E⟩_T, let E_{1...p} be component-based grammars over theory T such that E_1 ⊂ ··· ⊂ E_p ⊆ E, ≺ be a well order on E_{1...p} × ℕ, and q ≥ 0 be an upper bound on the size of expressions. Then, HEnum(S, E_{1...p}, ≺, q) will enumerate each distinct expression at most once.*

<sup>5</sup> We use :: as the cons operator for sequences, e.g., x :: ⟨y, z⟩ = ⟨x, y, z⟩.

Algorithm 2. *Hybrid enumeration* to combat overfitting in SyGuS


Algorithm 3. An algorithm to divide a given size budget among subexpressions <sup>5</sup>

```
1  func Divide(a: Arity, q: Size, l: Op. Level, j: Expr. Level, α: Accumulated Args.)
2    Requires: 1 ≤ a ≤ q ∧ l ≤ j
3    if a = 1 then
4      if l = j ∨ ∃⟨x, y⟩ ∈ α : x = j then return { ⟨1, q⟩ :: α, ..., ⟨j, q⟩ :: α }
5      return { ⟨j, q⟩ :: α }
6    L ← { }
7    for u ← 1 to j do
8      for v ← 1 to (q − a + 1) do
9        L ← L ∪ Divide(a − 1, q − v, l, j, ⟨u, v⟩ :: α)
10   return L
```
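A direct Python transcription of Divide may make the recursion easier to follow (a sketch under our reading of Algorithm 3: locations are (level, size) pairs, and the accumulator α is a tuple that we prepend to, playing the role of the cons operator):

```python
def divide(a, q, l, j, acc=()):
    """Enumerate argument-location lists for an arity-`a` operator.

    Each returned tuple of (grammar-level, size) pairs assigns locations whose
    sizes sum to q. The guard on the first argument ensures the constructed
    expression is new at level j: either the operator itself first appears at
    level j (l == j), or some argument comes from E_j \\ E_{j-1}.
    """
    assert 1 <= a <= q and l <= j
    if a == 1:
        if l == j or any(x == j for (x, _) in acc):
            # Any level 1..j may supply the first argument.
            return {((u, q),) + acc for u in range(1, j + 1)}
        # Otherwise force the first argument into E_j \ E_{j-1}.
        return {((j, q),) + acc}
    L = set()
    for u in range(1, j + 1):
        for v in range(1, q - a + 2):        # v ranges over 1 .. q - a + 1
            L |= divide(a - 1, q - v, l, j, ((u, v),) + acc)
    return L
```

For instance, a binary operator with budget 3 in a single-grammar setting yields the two splits (2, 1) and (1, 2) of argument sizes.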
#### 5 Experimental Evaluation

In this section we empirically evaluate PLearn and HEnum. Our evaluation uses a set of 180 synthesis benchmarks,<sup>6</sup> consisting of all 127 official benchmarks from the Inv track of 2018 SyGuS competition [4] augmented with benchmarks from the 2018 Software Verification competition (SV-Comp) [8] and challenging verification problems proposed in prior work [9,10]. All these synthesis tasks are

<sup>6</sup> All benchmarks are available at https://github.com/SaswatPadhi/LoopInvGen.

defined over integer and Boolean values, and we evaluate them with the six grammars described in Fig. 1. We have omitted benchmarks from other tracks of the SyGuS competition, as they either require us to construct E_{1...p} (Sect. 4) by hand or lack verification oracles. All our experiments use an 8-core Intel® Xeon® E5 machine clocked at 2.30 GHz, with 32 GB memory, running Ubuntu® 18.04.

#### 5.1 Robustness of PLearn

For five state-of-the-art SyGuS solvers – (a) LoopInvGen [29], (b) CVC4 [7,33], (c) Stoch [3, III F], (d) SketchAC [8,20], and (e) EUSolver [5] – we compared performance across various grammars, with and without the PLearn framework (Algorithm 1). In this framework, to solve a SyGuS problem at the p-th expressiveness level from our six integer-arithmetic grammars (see Fig. 1), we run p independent parallel instances of a SyGuS tool, each with one of the first p grammars. For example, to solve a SyGuS problem with the Polyhedra grammar, we run four instances of a solver, with the Equalities, Intervals, Octagons, and Polyhedra grammars. We evaluate these runs for each tool, for each of the 180 benchmarks, and for each of the six expressiveness levels.

Fig. 6. The number of failures on increasing grammar expressiveness, for state-of-the-art SyGuS tools, with and without the PLearn framework (Algorithm 1)

Figure 6 summarizes our findings. Without PLearn the number of failures initially decreases and then increases across all solvers, as grammar expressiveness increases. However, with PLearn the tools incur fewer failures at a given level of expressiveness, and there is a trend of *decreased* failures with increased expressiveness. Thus, we have demonstrated that PLearn is an effective measure to mitigate overfitting in SyGuS tools and significantly improve their performance.

#### 5.2 Performance of Hybrid Enumeration

To evaluate the performance of hybrid enumeration, we augment an existing synthesis engine with HEnum (Algorithm 2). We modify our LoopInvGen tool [29], which is the best-performing SyGuS synthesizer from Fig. 6. Internally, LoopInvGen leverages Escher [2], an enumerative synthesizer, which we replace with HEnum. We make no other changes to LoopInvGen. We evaluate the performance and resource usage of this solver, LoopInvGen+HE, relative to the original LoopInvGen with and without PLearn (Algorithm 1).

*Performance.* In Fig. 7(a), we show the number of failures across our six grammars for LoopInvGen, LoopInvGen+HE and LoopInvGen with PLearn, over our 180 benchmarks. LoopInvGen+HE has a significantly lower failure rate than LoopInvGen, and the number of failures decreases with grammar expressiveness. Thus, hybrid enumeration is a good proxy for PLearn.

Fig. 7. L: LoopInvGen, H: LoopInvGen+HE, P: PLearn (LoopInvGen). H is not only significantly more robust against increasing grammar expressiveness, but it also has a smaller total-time cost (τ) than P and a negligible overhead over L.

*Resource Usage.* To estimate how computationally expensive each solver is, we compare their *total-time cost* (τ). Since LoopInvGen and LoopInvGen+HE are single-threaded, for them we simply use the wall-clock time for synthesis as the total-time cost. However, for PLearn with p parallel instances of LoopInvGen, we consider the total-time cost as p times the wall-clock time for synthesis.

In Fig. 7(b), we show the median overhead (ratio of τ) incurred by PLearn over LoopInvGen+HE and LoopInvGen+HE over LoopInvGen, at various expressiveness levels. As we move to grammars of increasing expressiveness, the total-time cost of PLearn increases significantly, while the total-time cost of LoopInvGen+HE essentially matches that of LoopInvGen.

#### 5.3 Competition Performance

Finally, we evaluate the performance of LoopInvGen+HE on the benchmarks from the Inv track of the 2018 SyGuS competition [4], against the official winning solver, which we denote LIG [28]—a version of LoopInvGen [29] that has been extensively tuned for this track. In the competition, there are some invariant-synthesis problems where the postcondition itself is a satisfying expression. LIG starts with the postcondition as the first candidate and is extremely fast on such programs. For a fair comparison, we added this heuristic to LoopInvGen+HE as well. No other change was made to LoopInvGen+HE.

LIG solves 115 benchmarks in a total of 2191 seconds, whereas LoopInvGen+HE solves 117 benchmarks in 429 seconds, for a mean speedup of over 5×. Moreover, no entrant to the competition could solve [4] the two additional benchmarks (gcnr_tacas08 and fib_20) that LoopInvGen+HE solves.

#### 6 Related Work

The most closely related work to ours investigates overfitting for verification tools [36]. Our work differs from theirs in several respects. First, we address the problem of overfitting in CEGIS-based synthesis. Second, we formally define overfitting and prove that all synthesizers must suffer from it, whereas they only observe overfitting empirically. Third, while they use cross-validation to combat overfitting in tuning a specific hyperparameter of a verifier, our approach is to search for solutions at different expressiveness levels.

The general problem of efficiently searching a large space of programs for synthesis has been explored in prior work. Lee et al. [24] use a probabilistic model, learned from known solutions to synthesis problems, to enumerate programs in order of their likelihood. Other approaches employ type-based pruning of large search spaces [26,32]. These techniques are orthogonal to, and may be combined with, our approach of exploring grammar subsets.

Our results are widely applicable to existing SyGuS tools, but some tools fall outside our purview. For instance, in programming-by-example (PBE) systems [18, §7], the specification consists of a set of input-output examples. Since any program that meets the given examples is a valid satisfying expression, our notion of overfitting does not apply to such tools. However, in recent work, Inala and Singh [19] show that incrementally increasing expressiveness can also aid PBE systems. They report that searching within increasingly expressive grammar subsets requires significantly fewer examples to find expressions that generalize better over unseen data. Other instances where synthesizers can have a free lunch, i.e., always generate a solution with a small number of counterexamples, include systems that use grammars with limited expressiveness [16,21,35].

Our paper falls in the category of formal results about SyGuS. In one such result, Jha and Seshia [22] analyze the effects of different kinds of counterexamples and of providing bounded versus unbounded memory to learners. Notably, they do not consider variations in "concept classes" or "program templates," which are precisely the focus of our study. Therefore, our results are complementary: we treat counterexamples and learners as opaque and instead focus on grammars.

#### 7 Conclusion

Program synthesis is a vibrant research area; new and better synthesizers are being built each year. This paper investigates a general issue that affects all CEGIS-based SyGuS tools. We recognize the problem of overfitting, formalize it, and identify the conditions under which it must occur. Furthermore, we provide mitigating measures for overfitting that significantly improve the existing tools.

Acknowledgement. We thank Guy Van den Broeck and the anonymous reviewers for helpful feedback for improving this work, and the organizers of the SyGuS competition for making the tools and benchmarks publicly available.

This work was supported in part by the National Science Foundation (NSF) under grants CCF-1527923 and CCF-1837129. The lead author was also supported by an internship and a PhD Fellowship from Microsoft Research.

#### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### Proving Unrealizability for Syntax-Guided Synthesis

Qinheping Hu1(B) , Jason Breck<sup>1</sup>, John Cyphert<sup>1</sup>, Loris D'Antoni<sup>1</sup>, and Thomas Reps1,2

> <sup>1</sup> University of Wisconsin-Madison, Madison, USA (qhu28@wisc.edu)
> <sup>2</sup> GrammaTech, Inc., Ithaca, USA

Abstract. We consider the problem of automatically establishing that a given syntax-guided-synthesis (SyGuS) problem is *unrealizable* (i.e., has no solution). Existing techniques have quite limited ability to establish unrealizability for general SyGuS instances in which the grammar describing the search space contains infinitely many programs. By encoding the synthesis problem's grammar G as a nondeterministic program P_G, we reduce the unrealizability problem to a reachability problem such that, if a standard program-analysis tool can establish that a certain assertion in P_G always holds, then the synthesis problem is unrealizable.

Our method can be used to augment existing SyGuS tools so that they can establish that a successfully synthesized program q is *optimal* with respect to some syntactic cost—e.g., q has the fewest possible if-then-else operators. Using known techniques, grammar G can be transformed to generate the set of all programs with lower costs than q—e.g., fewer conditional expressions. Our algorithm can then be applied to show that the resulting synthesis problem is unrealizable. We implemented the proposed technique in a tool called nope. nope can prove unrealizability for 59/132 variants of existing linear-integer-arithmetic SyGuS benchmarks, whereas all existing SyGuS solvers lack the ability to prove that these benchmarks are unrealizable, and time out on them.

#### 1 Introduction

The goal of program synthesis is to find a program in some search space that meets a specification—e.g., satisfies a set of examples or a logical formula. Recently, a large family of synthesis problems has been unified into a framework called *syntax-guided synthesis* (SyGuS). A SyGuS problem is specified by a regular-tree grammar that describes the search space of programs, and a logical formula that constitutes the behavioral specification. Many synthesizers now support a specific format for SyGuS problems [1], and compete in annual synthesis competitions [2]. Thanks to these competitions, these solvers are now quite mature and are finding a wealth of applications [9].

Consider the SyGuS problem to synthesize a function f that computes the maximum of two variables x and y, denoted by (ψ_max2(f, x, y), G_1). The goal is to create e_f, an expression-tree for f, where e_f is in the language of the following regular-tree grammar G_1:

```
Start ::= Plus(Start, Start) | IfThenElse(BExpr, Start, Start) | x | y | 0 | 1
BExpr ::= GreaterThan(Start, Start) | Not(BExpr) | And(BExpr, BExpr)
```

and ∀x, y. ψ_max2([[e_f]], x, y) is valid, where [[e_f]] denotes the meaning of e_f, and

$$
\psi\_{\mathbf{max2}}(f, x, y) := f(x, y) \ge x \land f(x, y) \ge y \land (f(x, y) = x \lor f(x, y) = y).
$$

SyGuS solvers can easily find a solution, such as

e := IfThenElse(GreaterThan(x, y), x, y).

Although many solvers can now find solutions efficiently to many SyGuS problems, there has been effectively no work on the much harder task of proving that a given SyGuS problem is *unrealizable*—i.e., that it does not admit a solution. For example, consider the SyGuS problem (ψ_max2(f, x, y), G_2), where G_2 is the more restricted grammar with the if-then-else operators and conditions stripped out:

$$\text{Start } ::= \text{Plus(Start,Start)} \quad \mid x \mid y \mid 0 \mid 1$$

This SyGuS problem does *not* have a solution, because no expression generated by G_2 meets the specification.<sup>1</sup> However, to the best of our knowledge, current SyGuS solvers cannot prove that such a SyGuS problem is unrealizable.<sup>2</sup>

A key property of the previous example is that the grammar is infinite. When such a SyGuS problem is realizable, any search technique that systematically explores the infinite search space of possible programs will eventually identify a solution to the synthesis problem. In contrast, proving that a problem is unrealizable requires showing that *every* program in the *infinite* search space *fails to satisfy* the specification. This problem is in general undecidable [6]. Although we cannot hope to have an algorithm for establishing unrealizability, the challenge is to find a technique that succeeds for the kinds of problems encountered in practice. Existing synthesizers can detect the absence of a solution in certain cases (e.g., because the grammar is finite, or is infinite but generates only a finite number of functionally distinct programs). However, in practice, as our

<sup>2</sup> The synthesis problem presented above is one that is generated by a recent tool called QSyGuS, which extends SyGuS with quantitative syntactic objectives [10]. The advantage of using quantitative objectives in synthesis is that they can be used to produce higher-quality solutions—e.g., smaller, more readable, more efficient, etc. The synthesis problem (ψmax2(f, x, y), G2) arises from a QSyGuS problem in which the goal is to produce an expression that (i) satisfies the specification ψmax2(f, x, y), and (ii) uses the smallest possible number of if-then-else operators. Existing SyGuS solvers can easily produce a solution that uses one if-then-else operator, but cannot prove that no better solution exists—i.e., (ψmax2(f, x, y), G2) is unrealizable.

<sup>1</sup> Grammar G_2 only generates terms that are equivalent to some linear function of x and y; however, the maximum function cannot be described by a linear function.

experiments show, this ability is limited—no existing solver was able to show unrealizability for any of the examples considered in this paper.

In this paper, we present a technique for proving that a possibly infinite SyGuS problem is unrealizable. Our technique builds on two ideas.


The encoding mentioned in item 2 is non-trivial, for three reasons. The following list explains each issue and sketches how it is addressed.

*(1) Infinitely many terms.* We need to model the infinitely many terms generated by the grammar of a given synthesis problem (ψ(f, x̄), G).

To address this issue, we use non-determinism and recursion, and give an encoding P[G, E] such that (i) each non-deterministic path p in the program P[G, E] corresponds to a possible expression e_p that G can generate, and (ii) for each expression e that G can generate, there is a path p_e in P[G, E]. (There is an isomorphism between the paths and the expression-trees of G.)

*(2) Nondeterminism.* We need the computation performed along each path p in P[G, E] to mimic the execution of expression e_p. Because the program uses non-determinism, we need to make sure that, for a given path p in the program P[G, E], computational steps are carried out that mimic the evaluation of e_p for *each* of the finitely many example inputs in E.

We address this issue by threading the expression-evaluation computations associated with each example in E through the *same* non-deterministic choices.

*(3) Complex specifications.* We need to handle specifications that allow nested calls of the program being synthesized.

For instance, consider the specification f(f(x)) = x. To handle this specification, we introduce a new variable y and rewrite the specification as f(x) = y ∧ f(y) = x. Because y is now also used as an input to f, we thread the computations for both x and y through the non-deterministic recursive program.

Our work makes the following contributions:


Section 6 discusses related work. Some additional technical material, proofs, and full experimental results are given in [13].

#### 2 Illustrative Example

In this section, we illustrate the main components of our framework for establishing the unrealizability of a SyGuS problem.

As an example, we use the problem (ψ_max2(f, x, y), G_2) discussed in Sect. 1, and show how its unrealizability can be proven using four input examples: (0, 0), (0, 1), (1, 0), and (1, 1).
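On these four inputs, the unrealizability of (ψ_max2(f, x, y), G_2) can even be confirmed by brute force. The sketch below (our illustration, not the paper's method) evaluates every G_2 expression up to a fixed size on all four examples at once, as a tuple of outputs, and checks that none satisfies ψ_max2 even on the examples alone: any G_2 term denotes a·x + b·y + c with a, b, c ≥ 0, and the examples force c = 0, b = 1, a = 1, which fails on (1, 1).

```python
from itertools import product

EXAMPLES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def spec_max2(outs):
    """psi_max2 checked pointwise on the four example inputs."""
    return all(f >= x and f >= y and (f == x or f == y)
               for (x, y), f in zip(EXAMPLES, outs))

# Each G2 expression is represented semantically, as the tuple of its values
# on the four examples; Plus is pointwise addition of such tuples.
table = {1: {tuple(x for x, _ in EXAMPLES),   # x
             tuple(y for _, y in EXAMPLES),   # y
             (0, 0, 0, 0),                    # 0
             (1, 1, 1, 1)}}                   # 1

def exprs_of_size(k):
    if k not in table:
        table[k] = {tuple(u + v for u, v in zip(a, b))
                    for i in range(1, k - 1)  # the Plus node itself costs 1
                    for a, b in product(exprs_of_size(i),
                                        exprs_of_size(k - 1 - i))}
    return table[k]

# No expression of size up to 7 satisfies the spec on the examples.
found = any(spec_max2(o) for k in range(1, 8) for o in exprs_of_size(k))
```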

Fig. 1. Program P[G_2, E_1] created during the course of proving the unrealizability of (ψ_max2(f, x, y), G_2) using the set of input examples E_1 = {(0, 1)}.

Our method can be seen as a variant of Counter-Example-Guided Inductive Synthesis (CEGIS), in which the goal is to create a program P in which a certain assertion always holds. Until such a program is created, each round of the algorithm returns a counter-example, from which we extract an additional input example for the original SyGuS problem. On the i-th round, the current set of input examples E_i is used, together with the grammar (in this case G_2) and the specification of the desired behavior, ψ_max2(f, x, y), to create a candidate program P[G_2, E_i]. The program P[G_2, E_i] contains an assertion, and a standard program analyzer is used to check whether the assertion always holds.

Suppose that for the SyGuS problem (ψ_max2(f, x, y), G_2) we start with just the one example input (0, 1), i.e., E_1 = {(0, 1)}. Figure 1 shows the initial program P[G_2, E_1] that our method creates. The function spec implements the predicate ψ_max2(f, x, y). (All of the programs {P[G_2, E_i]} use the same function spec.) The initialization statements "int x\_0 = 0; int y\_0 = 1;" at line (21) in procedure main correspond to the input example (0, 1). The recursive procedure Start encodes the productions of grammar G_2. Start is non-deterministic; it contains four calls to an external function nd(), which returns a non-deterministically chosen Boolean value. The calls to nd() can be understood as controlling whether or not a production is selected from G_2 during a top-down, left-to-right generation of an expression-tree: lines (3)–(8) correspond to "Start ::= Plus(Start, Start)," and lines (10), (11), (12), and (13) correspond to "Start ::= x," "Start ::= y," "Start ::= 1," and "Start ::= 0," respectively. The code in the five cases in the body of Start encodes the semantics of the respective productions of G_2; in particular, the statements that are executed along any execution path of P[G_2, E_1] implement the *bottom-up evaluation of some expression-tree that can be generated by* G_2. For instance, consider the path that visits statements in the following order (for brevity, some statement numbers have been elided):

$$21 \quad 22 \quad (\_{\mathtt{Start}} \quad 3 \quad 4 \quad (\_{\mathtt{Start}} \quad 10 \quad )\_{\mathtt{Start}} \quad 6 \quad (\_{\mathtt{Start}} \quad 12 \quad )\_{\mathtt{Start}} \quad 8 \quad )\_{\mathtt{Start}} \quad 23 \qquad (1)$$

where (Start and )Start indicate entry to, and return from, procedure Start, respectively. Path (1) corresponds to the top-down, left-to-right generation of the expression-tree Plus(x,1), interleaved with the tree's bottom-up evaluation.

Note that with path (1), when control returns to main, variable I\_0 has the value 1, and thus the assertion at line (23) fails.

A sound program analyzer will discover that some such path exists in the program, and will return the sequence of non-deterministic choices required to follow one such path. Suppose that the analyzer chooses to report path (1); the sequence of choices would be t, f, t, f, f, f, t, which can be decoded to create the expression-tree Plus(x,1). At this point, we have a candidate definition for f: f = x + 1. This formula can be checked using an SMT solver to see whether it satisfies the behavioral specification ψmax2(f, x, y). In this case, the SMT solver returns "false." One counter-example that it could return is (0, 0).

At this point, program P[G2, E2] would be constructed using both of the example inputs (0, 1) and (0, 0). Rather than describe P[G2, E2], we will describe the final program constructed, P[G2, E4] (see Fig. 2).

As can be seen from the comments in the two programs, program P[G2, E4] has the same basic structure as P[G2, E1].


The main difference is that because the encoding of <sup>G</sup><sup>2</sup> in Start uses nondeterminism, we need to make sure that along *each* path p in P[G2, E4], each of the example inputs is used to evaluate the *same* expression-tree. We address this issue by threading the expression-evaluation computations associated with each of the example inputs through the *same* non-deterministic choices. That is, each of the five "production cases" in Start has four encodings of the production's semantics—one for each of the four expression evaluations. By this means, the statements that are executed along path p perform *four simultaneous bottom-up evaluations* of the expression-tree from G<sup>2</sup> that corresponds to p.

Programs P[G2, E2] and P[G2, E3] are similar to P[G2, E4], but their paths carry out two and three simultaneous bottom-up evaluations, respectively. The

Fig. 2. Program P[G2, E4] created during the course of proving the unrealizability of (ψmax2(f, x, y), G2) using the set of input examples E<sup>4</sup> = *{*(0, 0), (0, 1), (1, 0), (1, 1)*}*.

actions taken during rounds 2 and 3 to generate a new counter-example—and hence a new example input—are similar to what was described for round 1. On round 4, however, the program analyzer will determine that the assertion on lines (34)–(35) always holds, which means that there is no path through P[G2, E4] for which the behavioral specification holds for all of the input examples. This property means that there is no expression-tree that satisfies the specification—i.e., the SyGuS problem (ψmax2(f, x, y), G2) is unrealizable.

Our implementation uses the program-analysis tool SeaHorn [8] as the assertion checker. In the case of P[G2, E4], SeaHorn takes only 0.5 s to establish that the assertion in P[G2, E4] always holds.

#### 3 SyGuS, Realizability, and CEGIS

#### 3.1 Background

*Trees and Tree Grammars.* A *ranked alphabet* is a tuple (Σ, rkΣ) where Σ is a finite set of symbols and rkΣ : Σ → N associates a rank to each symbol. For every m ≥ 0, the set of all symbols in Σ with rank m is denoted by Σ<sup>(m)</sup>. In our examples, a ranked alphabet is specified by showing the set Σ and attaching the respective rank to every symbol as a superscript—e.g., Σ = {+<sup>(2)</sup>, c<sup>(0)</sup>}. (For brevity, the superscript is sometimes omitted.) We use TΣ to denote the set of all (ranked) trees over Σ—i.e., TΣ is the smallest set such that (*i*) Σ<sup>(0)</sup> ⊆ TΣ, and (*ii*) if σ<sup>(k)</sup> ∈ Σ<sup>(k)</sup> and t1,...,tk ∈ TΣ, then σ<sup>(k)</sup>(t1, ··· , tk) ∈ TΣ. In what follows, we assume a fixed ranked alphabet (Σ, rkΣ).

In this paper, we focus on *typed* regular tree grammars, in which each nonterminal and each symbol is associated with a type. There is a finite set of types {τ1,...,τk}. Associated with each symbol σ<sup>(i)</sup> ∈ Σ<sup>(i)</sup>, there is a type assignment aσ(i) = (τ0, τ1,...,τi), where τ0 is called the *left-hand-side type* and τ1,...,τi are called the *right-hand-side types*. Tree grammars are similar to word grammars, but generate trees over a ranked alphabet instead of words.

Definition 1 (Regular Tree Grammar). *A* typed regular tree grammar *(RTG) is a tuple* G = (N, Σ, S, a, δ)*, where* N *is a finite set of non-terminal symbols of arity 0;* Σ *is a ranked alphabet;* S ∈ N *is an initial non-terminal;* a *is a type assignment that gives types for members of* Σ ∪ N*; and* δ *is a finite set of productions of the form* A0 → σ<sup>(i)</sup>(A1,...,Ai)*, where for* 1 ≤ j ≤ i*, each* Aj ∈ N *is a non-terminal such that if* a(σ<sup>(i)</sup>) = (τ0, τ1,...,τi) *then* a(Aj) = τj*.*

In a SyGuS problem, each variable, such as x and y in the example RTGs in Sect. 1, is treated as an arity-0 symbol—i.e., x<sup>(0)</sup> and y<sup>(0)</sup>.

Given a tree t ∈ TΣ∪N, applying a production r = A → β to t produces the tree t′ that results from replacing the left-most occurrence of A in t with the right-hand side β. A tree t ∈ TΣ is generated by the grammar G—denoted by t ∈ L(G)—iff it can be obtained by applying a sequence of productions r1 ··· rn to the tree whose root is the initial non-terminal S.

*Syntax-Guided Synthesis.* A SyGuS problem is specified with respect to a background theory T—e.g., linear arithmetic—and the goal is to synthesize a function f that satisfies two constraints provided by the user. The first constraint, ψ(f, x¯), describes a *semantic property* that f should satisfy. The second constraint limits the *search space* S of f, and is given as a set of expressions specified by an RTG G that defines a subset of all terms in T.

Definition 2 (SyGuS). *<sup>A</sup>* SyGuS *problem over a background theory* <sup>T</sup> *is a pair* sy = (ψ(f, x¯), G) *where* G *is a regular tree grammar that only contains terms in* <sup>T</sup>*—i.e.,* <sup>L</sup>(G) <sup>⊆</sup> <sup>T</sup>*—and* <sup>ψ</sup>(f, <sup>x</sup>¯) *is a Boolean formula constraining the semantic behavior of the synthesized program* f*.*

*A* SyGuS *problem is* realizable *if there exists an expression* e ∈ L(G) *such that* ∀x̄. ψ([[e]], x̄) *is true. Otherwise we say that the problem is* unrealizable*.*

Theorem 1 (Undecidability [6]). *Given a* SyGuS *problem* sy*, it is undecidable to check whether* sy *is realizable.*

*Counterexample-Guided Inductive Synthesis.* The Counterexample-Guided Inductive Synthesis (CEGIS) algorithm is a popular approach to solving synthesis problems. Instead of directly looking for an expression that satisfies the specification ψ on *all* possible inputs, the CEGIS algorithm uses a synthesizer S that can find expressions that are correct on a *finite* set of examples E. If S finds a solution that is correct on all elements of E, CEGIS uses a verifier V to check whether the discovered solution is also correct for all possible inputs to the problem. If not, a counterexample obtained from V is added to the set of examples, and the process repeats. More formally, CEGIS starts with an empty set of examples E and repeats the following steps: (1) use S to find an expression e that satisfies ψ on all examples in E; (2) use V to check whether e satisfies ψ on all possible inputs; (3) if V finds a counterexample c, add c to E and return to step (1); otherwise, return e as a solution.


Because SyGuS problems are only defined over first-order decidable theories, any SMT solver can be used as the verifier V to check whether the formula ¬ψ([[e]], x̄) is satisfiable. On the other hand, providing a synthesizer S to find solutions such that ∀x̄ ∈ E. ψ([[e]], x̄) holds is a much harder problem, because e is a second-order term drawn from an infinite search space. In fact, checking whether such an e exists is an undecidable problem [6].

The main contribution of our paper is a reduction of the unrealizability problem—i.e., the problem of proving that there is no expression e ∈ L(G) such that ∀x̄ ∈ E. ψ([[e]], x̄) holds—to an unreachability problem (Sect. 4). This reduction allows us to use existing (un)reachability verifiers to check whether a SyGuS instance is unrealizable.

### 3.2 CEGIS and Unrealizability

The CEGIS algorithm is sound but incomplete for proving unrealizability. Given a SyGuS problem sy = (ψ(f, x̄), G) and a finite set of inputs E, we denote with syE := (ψE(f, x̄), G) the corresponding SyGuS problem that only requires the function f to be correct on the examples in E.

#### Lemma 1 (Soundness). *If* sy<sup>E</sup> *is unrealizable then* sy *is unrealizable.*

Even when given a perfect synthesizer S—i.e., one that can solve a problem sy<sup>E</sup> for every possible set E—there are SyGuS problems for which the CEGIS algorithm is not powerful enough to prove unrealizability.

Lemma 2 (Incompleteness). *There exists an unrealizable* SyGuS *problem* sy *such that for every finite set of examples* E *the problem* sy<sup>E</sup> *is realizable.*

Despite this negative result, we will show that a CEGIS algorithm can prove unrealizability for many SyGuS instances (Sect. 5).

#### 4 From Unrealizability to Unreachability

In this section, we show how a SyGuS problem over finitely many examples can be reduced to a reachability problem for a non-deterministic, recursive program written in an imperative programming language.

#### 4.1 Reachability Problems

A *program* P takes an initial state I as input and outputs a final state O, i.e., [[P]](I) = O, where [[·]] denotes the semantic function of the programming language. As illustrated in Sect. 2, we allow a program to contain calls to an external function nd(), which returns a non-deterministically chosen Boolean value. When program P contains calls to nd(), we use P̂ to denote the program that is the same as P except that P̂ takes an additional integer input n, and each call nd() is replaced by a call to a local function nextbit() defined as follows:

```
bool nextbit() { bool b = n % 2; n = n >> 1; return b; }
```
In other words, the integer parameter n of P̂[n] formalizes all of the non-deterministic choices made by P in calls to nd().

For the programs P[G, E] used in our unrealizability algorithm, the only calls to nd() are ones that control whether or not a production is selected from grammar G during a top-down, left-to-right generation of an expression-tree. Given n, we can decode it to identify which expression-tree n represents.

*Example 1.* Consider again the SyGuS problem (ψmax2(f, x, y), G2) discussed in Sect. 2. In the discussion of the initial program P[G2, E1] (Fig. 1), we hypothesized that the program analyzer chose to report path (1) in P, for which the sequence of non-deterministic choices is t, f, t, f, f, f, t. That sequence means that for P̂[n], the value of n is 1000101 (base 2) (or 69 (base 10)). The 1s, from low-order to high-order position, represent choices of production instances in a top-down, left-to-right generation of an expression-tree. (The 0s represent rejected possible choices.) The rightmost 1 in n corresponds to the choice in line (3) of "Start ::= Plus(Start, Start)"; the 1 in the third-from-rightmost position corresponds to the choice in line (10) of "Start ::= x" as the left child of the Plus node; and the 1 in the leftmost position corresponds to the choice in line (12) of "Start ::= 1" as the right child. By this means, we learn that the behavioral specification ψmax2(f, x, y) holds for the example set E1 = {(0, 1)} for f → Plus(x,1).

Definition 3 (Reachability Problem). *Given a program* Pˆ[*n*]*, containing assertion statements and a non-deterministic integer input n, we use* re<sup>P</sup> *to denote the corresponding reachability problem. The reachability problem* re<sup>P</sup> *is satisfiable if there exists a value* n *that, when bound to n, falsifies any of the assertions in* Pˆ[*n*]*. The problem is unsatisfiable otherwise.*

#### 4.2 Reduction to Reachability

The main component of our framework is an encoding *enc* that, given a SyGuS problem syE = (ψE(f, x), G) over a set of examples E = {c1,...,ck}, outputs a program P[G, E] such that syE is realizable if and only if re*enc*(sy,E) is satisfiable. In this section, we define all the components of P[G, E], and state the correctness properties of our reduction.

*Remark:* In this section, we assume that in the specification ψ(f,x) every occurrence of f has x as input parameter. We show how to overcome this restriction in App. A [13]. In the following, we assume that the input x has type τ<sup>I</sup> , where τ<sup>I</sup> could be a complex type—e.g., a tuple type.

*Program Construction.* Recall that the grammar G is a tuple (N, Σ, S, a, δ). First, for each non-terminal A ∈ N, the program P[G, E] contains k global variables {g\_1\_A, ..., g\_k\_A} of type a(A) that are used to express the values resulting from evaluating expressions generated from non-terminal A on the k examples. Second, for each non-terminal A ∈ N, the program P[G, E] contains a function

```
void funcA(τI v1, ..., τI vk) { bodyA }
```

We denote by δ(A) = {r1,...,rm} the set of production rules of the form A → β in δ. The body bodyA of funcA has the following structure:

```
if(nd()) {Enδ(r1)}
else if(nd()) {Enδ(r2)}
...
else {Enδ(rm)}
```
The encoding Enδ(r) of a production r = A0 → b<sup>(j)</sup>(A1, ··· , Aj) is defined as follows (τi denotes the type of the term Ai):

```
funcA1(v1, ..., vk);
τ1 child_1_1 = g_1_A1; ... ; τ1 child_1_k = g_k_A1;
...
funcAj(v1, ..., vk);
τj child_j_1 = g_1_Aj; ... ; τj child_j_k = g_k_Aj;
g_1_A0 = enc_b^1(child_1_1, ..., child_j_1);
...
g_k_A0 = enc_b^k(child_1_k, ..., child_j_k);
```

Note that if b<sup>(j)</sup> is of arity 0—i.e., if j = 0—the construction yields k assignments of the form g\_m\_A0 = *enc*<sup>m</sup><sub>b</sub>().

The function *enc*<sup>m</sup><sub>b</sub> interprets the semantics of b on the mth input example. We take Linear Integer Arithmetic as an example to illustrate how *enc*<sup>m</sup><sub>b</sub> works.

$$\begin{array}{llll}
enc^{m}_{\mathtt{0}^{(0)}} & := \mathtt{0} & enc^{m}_{\mathtt{1}^{(0)}} & := \mathtt{1} \\
enc^{m}_{\mathtt{x}^{(0)}} & := \mathtt{vm} & enc^{m}_{\mathtt{Equals}^{(2)}}(L,R) & := \mathtt{(}L \mathbin{\mathtt{==}} R\mathtt{)} \\
enc^{m}_{\mathtt{Plus}^{(2)}}(L,R) & := L + R & enc^{m}_{\mathtt{Minus}^{(2)}}(L,R) & := L - R \\
enc^{m}_{\mathtt{IfThenElse}^{(3)}}(B,L,R) & \multicolumn{3}{l}{:= \mathtt{if}\,(B)\ L\ \mathtt{else}\ R}
\end{array}$$

We now turn to the correctness of the construction. First, we formalize the relationship between expression-trees in L(G), the semantics of P[G, E], and the number n. Given an expression-tree e, we assume that each node q in e is annotated with the production that has produced that node. Recall that δ(A) = {r1,...,rm} is the set of productions with head A (where the subscripts are indexes in some arbitrary, but fixed order). Concretely, for every node q, we assume there is a function pr(q) = (A, i), which associates q with a pair indicating that non-terminal A produced q using the production ri (i.e., ri is the ith production whose left-hand-side non-terminal is A).

We now define how we can extract a number #(e) for which the program P̂[#(e)] will exhibit the same semantics as the expression-tree e. First, for every node q in e such that pr(q) = (A, i), we define the following number:

$$\#_{nd}(q) = \begin{cases} 1\underbrace{0 \cdots 0}_{i-1} & \text{if } i < |\delta(A)| \\ \underbrace{0 \cdots 0}_{i-1} & \text{if } i = |\delta(A)| \end{cases}$$

The number #nd(q) indicates what suffix of the value of n will cause funcA to trigger the code corresponding to production <sup>r</sup>i. Let <sup>q</sup><sup>1</sup> ··· <sup>q</sup><sup>m</sup> be the sequence of nodes visited during a pre-order traversal of expression-tree e. The number corresponding to <sup>e</sup>, denoted by #(e), is defined as the bit-vector #nd(qm)··· #nd(q1).

Finally, we add the entry-point of the program, which calls the function funcS corresponding to the initial non-terminal S, and contains the assertion that encodes our reachability problem on all the input examples <sup>E</sup> <sup>=</sup> {c1,...,c<sup>k</sup>}.

```
void main() {
  τI x1 = c_1; ...; τI xk = c_k;
  funcS(x1, ..., xk);
  assert ⋁_{1≤i≤k} ¬ψ(f, c_i)[g_i_S / f(x)];   // at least one c_i fails
}
```

*Correctness.* We first need to show that the function #(·) captures the correct language of expression-trees. Given a non-terminal A, a value n, and input values i1,...,ik, we use [[funcA[n]]](i1,...,ik) = (o1,...,ok) to denote the values of the variables {g\_1\_A, ..., g\_k\_A} at the end of the execution of funcA[n] with the initial value of n equal to n and input values i1,...,ik. Given a non-terminal A, we write L(G, A) to denote the set of terms that can be derived starting with A.

Lemma 3. *Let* <sup>A</sup> *be a non-terminal,* <sup>e</sup> <sup>∈</sup> <sup>L</sup>(G, A) *an expression, and* {i1,...,i<sup>k</sup>} *an input set. Then,* ([[e]](i1),..., [[e]](ik)) = [[*funcA*[#(e)]]](i1,...,ik)*.*

Each procedure funcA[n](i1,...,ik) that we construct has an explicit dependence on variable n, where n controls the non-deterministic choices made by the funcA and procedures called by funcA. As a consequence, when relating numbers and expression-trees, there are two additional issues to contend with:

Non-termination. Some numbers can cause funcA[n] to fail to terminate. For instance, if the case for "Start ::= Plus(Start, Start)" in program P[G2, E1] from Fig. 1 were moved from the first branch (lines (3)–(8)) to the final else case (line (13)), the number n = 0 = ...0000000 (base 2) would cause Start to never terminate, due to repeated selections of Plus nodes. However, note that the only assert statement in the program is placed at the end of the main procedure. Now, consider a value of n such that re*enc*(sy,E) is satisfiable. Definition 3 implies that the flow of control will reach and falsify the assertion, which implies that funcA[n] terminates.<sup>4</sup>

Shared suffixes of sufficient length. In Example 1, we showed how for program P[G2, E1] (Fig. 1) the number n = 1000101 (base 2) corresponds to the top-down, left-to-right generation of Plus(x,1). That derivation consumed exactly seven bits; thus, any number that, written in base 2, shares the suffix 1000101—e.g., 11010101000101—will also generate Plus(x,1).

The issue of shared suffixes is addressed in the following lemma:

Lemma 4. *For every non-terminal* A *and number n such that* [[*funcA*[n]]](i1,...,ik) ≠ ⊥ *(i.e., funcA terminates when the non-deterministic choices are controlled by n), there exists a minimal* n′ *that is a (base* 2*) suffix of* n *for which (i) there is an* e ∈ L(G) *such that* #(e) = n′*, and (ii) for every input* {i1,...,ik}*, we have* [[*funcA*[n]]](i1,...,ik) = [[*funcA*[n′]]](i1,...,ik)*.*

We are now ready to state the correctness property of our construction.

Theorem 2. *Given a* SyGuS *problem* sy<sup>E</sup> = (ψE(f,x), G) *over a finite set of examples* E*, the problem* sy<sup>E</sup> *is realizable iff* re*enc*(sy,E) *is satisfiable.*

#### 5 Implementation and Evaluation

nope is a tool that can return two-sided answers to unrealizability problems of the form sy = (ψ, G). When it returns *unrealizable*, no expression-tree in L(G) satisfies ψ; when it returns *realizable*, some e ∈ L(G) satisfies ψ; nope can also time out. nope incorporates several existing pieces of software.


It is important to observe that SeaHorn, like most reachability verifiers, is only sound for *un*satisfiability—i.e., if SeaHorn returns *unsatisfiable*, the reachability problem is indeed unsatisfiable. Fortunately, SeaHorn's one-sided answers are in the correct direction for our application: to prove unrealizability, nope only requires the reachability verifier to be sound for unsatisfiability.

<sup>4</sup> If the SyGuS problem deals with the synthesis of programs for a language that can express non-terminating programs, that would be an additional source of non-termination, different from the one discussed in item Non-termination. That issue does not arise for LIA SyGuS problems. Dealing with the more general kind of non-termination is postponed for future work.

There is one aspect of nope that differs from the technique that has been presented earlier in the paper. While SeaHorn is sound for *un*reachability, it is not sound for reachability—i.e., it cannot soundly prove whether a synthesis problem is realizable. To address this problem, to check whether a given SyGuS problem syE is realizable on the finite set of examples E, nope also calls the SyGuS solver ESolver [2] to synthesize an expression-tree e that satisfies syE.<sup>5</sup>

In practice, for every intermediate problem sy<sup>E</sup> generated by the CEGIS algorithm, nope runs the ESolver on sy<sup>E</sup> and SeaHorn on re*enc*(sy,E) in *parallel*. If ESolver returns a solution e, SeaHorn is interrupted, and Z3 is used to check whether e satisfies ψ. Depending on the outcome, nope either returns *realizable* or obtains an additional input example to add to E. If SeaHorn returns *unsatisfiable*, nope returns *unrealizable*.

Modulo bugs in its constituent components, nope is sound for both realizability and unrealizability, but because of Lemma 2 and the incompleteness of SeaHorn, nope is not complete for unrealizability.

Benchmarks. We perform our evaluation on 132 variants of the 60 LIA benchmarks from the LIA SyGuS competition track [2]. We do not consider the other SyGuS benchmark track, Bit-Vectors, because the SeaHorn verifier is unsound for most bit-vector operations—e.g., bit-shifting. We used three suites of benchmarks. LimitedIf (resp. LimitedPlus) contains 57 (resp. 30) benchmarks in which the grammar bounds the number of times an IfThenElse (resp. Plus) operator can appear in an expression-tree to be 1 less than the number required to solve the original synthesis problem. We used the tool Quasi to automatically generate the restricted grammars. LimitedConst contains 45 benchmarks in which the grammar allows the program to contain only constants that are coprime to any constants that may appear in a valid solution—e.g., the solution requires using odd numbers, but the grammar only contains the constant 2. The numbers of benchmarks in the three suites differ because for certain benchmarks it did not make sense to create a limited variant—e.g., if the smallest program consistent with the specification contains no IfThenElse operators, no variant is created for the LimitedIf benchmark. In all our benchmarks, the grammars describing the search space contain infinitely many terms.

Our experiments were performed on an Intel Core i7 4.00 GHz CPU, with 32 GB of RAM, running Lubuntu 18.10 via VirtualBox. We used version 4.8 of Z3, commit 97f2334 of SeaHorn, and commit d37c50e of ESolver. The timeout for each individual SeaHorn/ESolver call is set at 10 min.

Experimental Questions. Our experiments were designed to answer the questions posed below.

EQ 1. Can nope prove unrealizability for variants of real SyGuS benchmarks, and how long does it take to do so?

<sup>5</sup> We chose ESolver because on the benchmarks we considered, ESolver outperformed other SyGuS solvers (e.g., CVC4 [3]).

*Finding:* nope *can prove unrealizability for* 59/132 *benchmarks.* For the 59 benchmarks solved by nope, the average time taken is 15.59 s. The time taken to perform the last iteration of the algorithm—i.e., the time taken by SeaHorn to return unsatisfiable—accounts for 87% of the total running time.

nope can solve all of the LimitedIf benchmarks for which the grammar allows at most one IfThenElse operator. Allowing more IfThenElse operators in the grammar leads to larger programs and larger sets of examples, and consequently the resulting reachability problems are harder to solve for SeaHorn.

For a similar reason, nope can solve only one of the LimitedPlus benchmarks. All other LimitedPlus benchmarks allow 5 or more Plus statements, resulting in grammars that have at least 130 productions.

nope can solve all LimitedConst benchmarks because these require few examples and result in small encoded programs.

EQ 2. How many examples does nope use to prove unrealizability and how does the number of examples affect the performance of nope?

Note that Z3 can produce different models for the same query, and thus different runs of nope can produce different sequences of examples. Hence, there is no guarantee that nope finds a good sequence of examples that prove unrealizability. One measure of success is whether nope is generally able to find a small number of examples when it succeeds in proving unrealizability.

*Finding:* nope *used 1 to 9 examples to prove unrealizability for the benchmarks on which it terminated.* Problems requiring large numbers of examples could not be solved because either ESolver or SeaHorn times out—e.g., on the problem max4, nope gets to the point where the CEGIS loop has generated 17 examples, at which point ESolver exceeds the timeout threshold.

*Finding: The number of examples required to prove unrealizability depends mainly on the arity of the synthesized function and the complexity of the grammar.* The number of examples seems to grow quadratically with the number of bounded operators allowed in the grammar. In particular, problems in which the grammar allows zero IfThenElse operators require 2–4 examples, while problems in which the grammar allows one IfThenElse operator require 7–9 examples.

Figure 3 plots the running time of nope against the number of examples generated by the CEGIS algorithm. *Finding: The solving time appears to grow exponentially with the number of examples* required to prove unrealizability.

#### 6 Related Work

The SyGuS formalism was introduced as a unifying framework to express several synthesis problems [1]. Caulfield et al. [6] proved that it is undecidable to determine whether a given SyGuS problem is realizable. Despite this negative result, there are several SyGuS solvers that compete in yearly SyGuS competitions [2] and can efficiently produce solutions to SyGuS problems when a solution exists. Existing SyGuS synthesizers fall into three categories: (*i*) Enumeration solvers enumerate programs with respect to a given total order [7]. If the given problem is unrealizable, these solvers typically only terminate if the language of the grammar is finite or contains finitely many functionally distinct programs. While in principle certain enumeration solvers can prune infinite portions of the search space, none of these solvers could prove unrealizability for any of the benchmarks considered in this paper. (*ii*) Symbolic solvers reduce the synthesis problem to a constraint-solving problem [3]. These solvers cannot reason about grammars that restrict allowed terms, and resort to enumeration whenever the candidate solution produced by the constraint solver is not in the restricted search space. Hence, they also cannot prove unrealizability. (*iii*) Probabilistic synthesizers randomly search the search space, and are typically unpredictable [14], providing no guarantees in terms of unrealizability.

*Synthesis as Reachability.* CETI [12] introduces a technique for encoding template-based synthesis problems as reachability problems. The CETI encoding only applies to the specific setting in which (*i*) the search space is described by an imperative program with a *finite number* of holes—i.e., the values that the synthesizer has to discover—and (*ii*) the specification is given as a finite number of input-output test cases with which the target program should agree. Because the number of holes is finite, and all holes correspond to values (and not terms), the reduction to a reachability problem only involves making the holes global variables in the program (and no more elaborate transformations).

In contrast, our reduction technique handles search spaces that are described by a grammar, which in general consist of an infinite set of terms (not just values). Due to this added complexity, our encoding has to account for (i) the semantics of the productions in the grammar, and (ii) the use of non-determinism to encode the choice of grammar productions. Our encoding creates one expressionevaluation computation for each of the example inputs, and threads these computations through the program so that each expression-evaluation computation makes use of the *same* set of non-deterministic choices.

Using the input-threading, our technique can handle specifications that contain nested calls of the synthesized program (e.g., f(f(x)) = x) (App. A [13]).

The input-threading technique builds a *product program* that performs multiple executions of the same function as done in relational program verification [4]. Alternatively, a different encoding could use multiple function invocations on individual inputs and require the verifier to thread the same bit-stream for all input evaluations. In general, verifiers perform much better on product programs [4], which motivates our choice of encoding.

*Unrealizability in Program Synthesis.* For certain synthesis problems—e.g., reactive synthesis [5]—the realizability problem is decidable. The framework tackled in this paper, SyGuS, is orthogonal to such problems, and it is undecidable to check whether a given SyGuS problem is realizable [6].

Mechtaev et al. [11] propose to use a variant of SyGuS to efficiently prune irrelevant paths in a symbolic-execution engine. In their approach, for each path π in the program, a synthesis problem p<sub>π</sub> is generated so that if p<sub>π</sub> is unrealizable, the path π is infeasible. The synthesis problems generated by Mechtaev et al. (which are not directly expressible in SyGuS) are decidable because the search space is defined by a finite set of templates, and the synthesis problem can be encoded by an SMT formula. To the best of our knowledge, our technique is the first one that can check unrealizability of general SyGuS problems in which the search space is an *infinite set of functionally distinct terms*.

Acknowledgment. This work was supported, in part, by a gift from Rajiv and Ritu Batra; by AFRL under DARPA MUSE award FA8750-14-2-0270 and DARPA STAC award FA8750-15-C-0082; by ONR under grant N00014-17-1-2889; by NSF under grants CNS-1763871 and CCF-1704117; and by the UW-Madison OVRGE with funding from WARF.

#### References


352 Q. Hu et al.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### Model Checking

### **BMC for Weak Memory Models: Relation Analysis for Compact SMT Encodings**

Natalia Gavrilenko<sup>1,4(B)</sup>, Hernán Ponce-de-León<sup>2</sup>, Florian Furbach<sup>3</sup>, Keijo Heljanko<sup>4</sup>, and Roland Meyer<sup>3</sup>

<sup>1</sup> Aalto University, Helsinki, Finland
<sup>2</sup> fortiss GmbH, Munich, Germany
<sup>3</sup> TU Braunschweig, Brunswick, Germany
<sup>4</sup> University of Helsinki and HIIT, Helsinki, Finland
natalia.gavrilenko@helsinki.fi

**Abstract.** We present Dartagnan, a bounded model checker (BMC) for concurrent programs under weak memory models. Its distinguishing feature is that the memory model is not implemented inside the tool but taken as part of the input. Dartagnan reads CAT, the standard language for memory models, in which x86/TSO, ARMv7, ARMv8, Power, C/C++, and the Linux kernel concurrency primitives can all be defined. BMC with memory models as inputs is challenging. One has to encode into SMT not only the program but also its semantics as defined by the memory model. What makes Dartagnan scale is its relation analysis, a novel static analysis that significantly reduces the size of the encoding. Dartagnan matches or even exceeds the performance of the model-specific verification tools Nidhugg and CBMC, as well as the performance of Herd, a CAT-compatible litmus testing tool. Compared to the unoptimized encoding, the speed-up is often more than two orders of magnitude.

**Keywords:** Weak memory models · CAT · Concurrency · BMC · SMT

#### **1 Introduction**

When developing concurrency libraries or operating system kernels, performance and scalability of the concurrency primitives are of paramount importance. These primitives rely on the synchronization guarantees of the underlying hardware and the programming language runtime environment. The formal semantics of these guarantees are often defined in terms of weak memory models. There is considerable interest in verification tools that take memory models into account [5,9,13,22].

A successful approach to formalizing weak memory models is CAT [11,12,16], a flexible specification language in which all memory models considered so far can be expressed succinctly. CAT, together with its accompanying tool Herd [4], has been used to formalize the semantics not only of assembly for x86/TSO, Power, ARMv7 and ARMv8, but also high-level programming languages, such as C/C++, transactional memory extensions, and recently the Linux kernel concurrency primitives [11,15,16,18,20,24,29]. This success indicates the need for universal verification tools that are not limited to a specific memory model.

We present Dartagnan [3], a bounded model checker that takes memory models as inputs. Dartagnan expects a concurrent program annotated with an assertion and a memory model for which the verification should be conducted. It verifies the assertion on those executions of the program that are valid under the given memory model and returns a counterexample execution if the verification fails. As is typical of BMC, the verification results hold relative to an unrolling bound [21]. The encoding phase, however, is new. Not only the program but also its semantics as defined by the CAT model are translated into an SMT formula.

Having to take into account the semantics quickly leads to large encodings. To overcome this problem, Dartagnan implements a novel *relation analysis*, which can be understood as a static analysis of the program semantics as defined by the memory model. More precisely, CAT defines the program semantics in terms of relations between the events that may occur in an execution. Depending on constraints over these relations, an execution is considered valid or invalid. Relation analysis determines the pairs of events that may influence a constraint of the memory model. Any remaining pair can be dropped from the encoding. The analysis is compatible with optimized fixpoint encodings presented in [27,28].

The second novelty is the support for advanced programming constructs. We redesigned Dartagnan's heap model, which now has pointers and arrays. Furthermore, we enriched the set of synchronization primitives, including read-modify-write and read-copy-update (RCU) instructions [26]. One motivation for this richer set of programming constructs is the Linux kernel memory model [15] that has recently been added to the kernel documentation [2]. This model has already been used by kernel developers to find bugs in and clarify details of the concurrency primitives. Since the model is expected to be refined with further development of the kernel, verification tools will need to quickly accommodate updates in the specification. So far, only Herd [4] has satisfied this requirement. Unfortunately, it is limited to fairly small programs (litmus tests). The present version of Dartagnan offers an alternative with substantially better performance.

We present experiments on a series of benchmarks consisting of 4751 Linux litmus tests and 7 mutual exclusion algorithms executed on TSO, ARM, and Linux. Despite the flexibility of taking memory models as inputs, Dartagnan's performance is comparable to CBMC [13] and considerably better than that of Nidhugg [5,9]. Both are model-specific tools. Compared to the previous version of Dartagnan [28] and compared to Herd [4], we gain a speed-up of more than two orders of magnitude, thanks to the relation analysis.

**Related Work.** In terms of the verification task to be solved, the following tools are the closest to ours. CBMC [13] is a scalable bounded model checker supporting TSO, but not ARM. An earlier version also supported Power. Nidhugg [5,9] is a stateless model checker supporting TSO, Power, and a subset of ARMv7. It is excellent for programs with a small number of executions. RCMC [22] implements a stateless model checking algorithm targeting C11. We cannot directly benchmark against it because the source code of the tool is not yet publicly available, nor do we fully support C11. Herd [4] is the only tool aside from ours that takes a CAT memory model as input. Herd does not scale well to programs with a large number of executions, including some of the Linux kernel tests. Other verification tasks (e.g., fence insertion to restore sequential consistency) are tackled by Memorax [6–8], offence [14], Fender [23], DFence [25], and trencher [19].

**Relation Analysis on an Example.** Consider the program (in the .litmus format) given to the left in the figure below. The assertion asks whether there is a reachable state with final values EBX = 1, ECX = 0. We analyze the program under the x86-TSO memory model shown below the program. The semantics of the program under TSO is a set of executions. An execution is a graph, similar to the one given below, where the nodes are events and the edges correspond to the relations defined by the memory model. Events are instances of instructions that access the shared memory: R (loads), W (stores, including initial stores), and M (the union of both). The atomic exchange instruction xchg [x], EAX gives rise to a pair of read and write events related by a (dashed) rmw edge. Such reads and writes belong to the set A of atomic read-modify-write events.

The relations rf, co, and fr model the communication of instructions via the shared memory (reading from a write, coherence, overwriting a read). Their restrictions rfe, coe, and fre denote (external) communication between instructions from different threads. Relation po is the program order within the same thread and po-loc is its restriction to events addressing the same memory location. Edges of mfence relate events separated by a fence. Further relations are derived from these base relations. To belong to the TSO semantics of the program, an execution has to satisfy the constraints of the memory model: empty rmw ∩ (fre ; coe), which enforces atomicity of read-modify-write events, and the two acyclicity constraints.
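To make the two constraint forms concrete, the following toy sketch (our own event names and helper functions, not Dartagnan's code) represents relations as sets of event pairs and checks an emptiness constraint and an acyclicity constraint on a hand-built execution:

```python
# Relations over events are just sets of pairs; the CAT operators become
# set operations.
def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def acyclic(r):
    """True iff the relation r, viewed as an edge set, has no cycle."""
    graph = {}
    for a, b in r:
        graph.setdefault(a, []).append(b)
    visited, on_stack = set(), set()
    def dfs(n):
        visited.add(n); on_stack.add(n)
        for m in graph.get(n, []):
            if m in on_stack or (m not in visited and dfs(m)):
                return True
        on_stack.discard(n)
        return False
    return not any(dfs(n) for n in list(graph) if n not in visited)

# Assumed toy execution: an rmw pair (R1, W1) with fre/coe edges that
# break atomicity.
rmw = {("R1", "W1")}
fre = {("R1", "W2")}
coe = {("W2", "W1")}
# empty rmw ∩ (fre ; coe): a non-empty result means atomicity is violated.
print(rmw & compose(fre, coe))
print(acyclic({("a", "b"), ("b", "c")}))  # True: no cycle
```

The SMT encoding expresses the same checks symbolically, with one Boolean variable per candidate edge instead of concrete sets.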

Dartagnan encodes the semantics of the given program under the given memory model into an SMT formula. The problem is that each edge (a, b) that may be present in a relation r gives rise to a variable r(a, b). The goal of our relation analysis is to reduce the number of edges that need to be encoded. We illustrate this on the constraint acyclic ghb-tso. The graph next to the program shows the 14 (dotted and solid) edges which may contribute to the relation ghb-tso. Of those, only the 6 solid edges can occur in a cycle. The dotted edges can be dropped from the SMT encoding. Our relation analysis determines the solid edges—edges that may have an influence on a constraint of the memory model. Additionally, ghb-tso is a composition of various subrelations (e.g., po-tso or co ∪ fr) that also require encoding into SMT. Relation analysis applies to subrelations as well. Applied to all constraints, it reduces the number of encoded edges for all (sub)relations from 221 to 58.

#### **2 Input, Functionality, and Implementation**

Dartagnan has the ambition of being widely applicable, from assembly over operating system code written in C/C++ to lock-free data structures. The tool accepts programs in PPC, x86, AArch64 assembly, and a subset of C11, all limited to the subsets supported by Herd's .litmus format. It also reads our own .pts format with C11-like syntax [28]. We refer to global variables as memory locations and to local variables as registers. We support pointers, i.e., a register may hold the address of a location. Addresses and values are integers, and we allow the same arithmetic operations for addresses as for regular integer values. Different synchronization mechanisms are available, including variants of read-modify-write, various fences, and RCU instructions [26].

We support the assertion language of Herd. Assertions define inequalities over the values of registers and locations. They come with quantifiers over the reachable states that should satisfy the inequalities.

We use the CAT language [11,12,16] to define memory models. A memory model consists of named relations between events that may occur in an execution. Whether or not an execution is valid is defined by constraints over these relations:

$$\begin{aligned}
\langle MM \rangle &::= \langle const \rangle \mid \langle rel \rangle \mid \langle MM \rangle \wedge \langle MM \rangle \\
\langle const \rangle &::= \operatorname{acyclic}(\langle r \rangle) \mid \operatorname{irreflexive}(\langle r \rangle) \mid \operatorname{empty}(\langle r \rangle) \\
\langle rel \rangle &::= \langle name \rangle := \langle r \rangle \\
\langle r \rangle &::= \langle b \rangle \mid \langle name \rangle \mid \langle r \rangle \cup \langle r \rangle \mid \langle r \rangle \setminus \langle r \rangle \mid \langle r \rangle \cap \langle r \rangle \mid \langle r \rangle^{-1} \mid \langle r \rangle^{+} \mid \langle r \rangle^{*} \mid \langle r \rangle ; \langle r \rangle \\
\langle b \rangle &::= \texttt{id} \mid \texttt{int} \mid \texttt{ext} \mid \texttt{po} \mid \texttt{fence}(\langle fname \rangle) \mid \texttt{rmw} \mid \texttt{ctrl} \mid \texttt{data} \mid \texttt{addr} \mid \texttt{loc} \mid \texttt{rf} \mid \texttt{co}.
\end{aligned}$$

CAT has a rich relational language, and we only show an excerpt above. So-called base relations *b* model the control flow, data flow, and synchronization constraints. The language provides intuitive operators to derive further relations. One may define relations recursively by referencing named relations. Their semantics is the least fixpoint.

Dartagnan is invoked with two inputs: the program, annotated with an assertion over the final states, and the memory model. There are two optional parameters related to the verification. The SMT encoding technique for recursive relations is selected by the parameter mode, chosen between knastertarski (default) and idl (see below). The parameter alias, chosen between none and andersen (default), defines whether to use an alias analysis for our relation analysis (cf. Sect. 3).

Being a bounded model checker, Dartagnan computes an unrolled program with conditionals but no loops. It encodes this acyclic program together with the memory model into an SMT formula and passes it to the Z3 solver. The formula has the form ψ<sub>prog</sub> ∧ ψ<sub>assert</sub> ∧ ψ<sub>mm</sub>, where ψ<sub>prog</sub> encodes the program, ψ<sub>assert</sub> the assertion, and ψ<sub>mm</sub> the memory model. We elaborate on the encoding of the program and the memory model. The assertion is already given as a formula.

We model the heap by encoding a new memory location for each variable and a set of locations for each memory allocation of an array. Every location has an address encoded as an integer variable whose value is chosen by the solver. In an array, the locations are required to have consecutive addresses. Instances of instructions are modeled as events, most notably stores (to the shared memory) and loads (from the shared memory).
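As a rough illustration (class and method names are ours; in the tool, addresses are integer variables whose values the solver chooses, constrained to be consecutive within an array, whereas this sketch fixes them eagerly), the heap model can be pictured as:

```python
# Illustrative heap sketch: every variable gets one integer-addressed
# location; an array allocation gets a block of consecutive addresses.
class Heap:
    def __init__(self):
        self.next_addr = 0
        self.locations = {}          # address -> symbolic location name

    def alloc(self, name, size=1):
        """Allocate `size` consecutive locations; return the base address."""
        base = self.next_addr
        for i in range(size):
            self.locations[base + i] = f"{name}[{i}]" if size > 1 else name
        self.next_addr += size
        return base

h = Heap()
x = h.alloc("x")          # a scalar variable: one location
arr = h.alloc("arr", 3)   # an array: three consecutive addresses
print(arr, h.locations[arr + 2])
```

Because addresses are plain integers, pointer arithmetic in the program maps directly to integer arithmetic over these address values.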

We encode relations by associating pairs of events with Boolean variables. Whether the pair (e1, e2) is contained in relation r is indicated by the variable r(e1, e2). Encoding the relations r1 ∩ r2, r1 ∪ r2, r1 ; r2, r1 \ r2, and r<sup>−1</sup> is straightforward [27]. For recursively defined and (reflexive and) transitive relations, Dartagnan lets the user choose between two methods for computing fixed points by setting the appropriate parameter. The integer-difference logic (IDL) method encodes a Kleene iteration by means of integer variables (one for each pair of events) representing the step in which the pair was added to the relation [27]. The Knaster-Tarski encoding simply looks for a post-fixpoint. We have shown in [28] that this is sufficient for reachability analysis.
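The idea behind the IDL-style encoding can be made concrete. The sketch below (our naming; an explicit computation rather than an SMT encoding) performs the Kleene iteration for r<sup>+</sup> over a finite event set and records, for each pair, the pass in which it was first added—the role the per-pair integer variables play in the encoding:

```python
# Kleene iteration for the transitive closure r+, with a step index per pair
# (mirroring the integer variables of the IDL encoding).
def kleene_closure(r):
    step = {p: 1 for p in r}       # base pairs enter at step 1
    closure = set(r)
    changed, k = True, 1
    while changed:
        changed, k = False, k + 1
        for (a, b) in list(closure):
            for (b2, c) in list(closure):
                if b == b2 and (a, c) not in closure:
                    closure.add((a, c))
                    step[(a, c)] = k
                    changed = True
    return closure, step

r = {(1, 2), (2, 3), (3, 4)}
closure, step = kleene_closure(r)
print(sorted(closure))  # the transitive closure of r
```

The Knaster-Tarski alternative drops the step indices entirely and only asserts closure conditions, accepting any post-fixpoint; [28] shows this suffices for reachability.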

#### **3 Relation Analysis**

To optimize the size of the encoding (and the solving times), we found it essential to reduce the domains of the relations. We determine for each relation a static over-approximation of the pairs of events that may be in this relation. Even more, we restrict the relation to the set of pairs that may influence a constraint of the given memory model. These restricted sets are the *relation analysis* information (of the program relative to the memory model). Technically, we compute, for each relation r, two sets of event pairs, *M*(r) and *A*(r). The former contains so-called *may pairs*, pairs of events that may be in relation r. This does not yet take into account whether the may pairs occur in some constraint of the memory model. The *active pairs A*(r) incorporate this information, and hence restrict the set of may pairs. As a consequence of the relation analysis, we only introduce Boolean variables r(e1, e2) for the pairs (e1, e2) ∈ *A*(r) to the SMT encoding.

The algorithm for constructing the may set and the active set is a fixpoint computation. What is unconventional is that the two sets propagate their information in different directions. For *A*(r), the computation proceeds from the constraints and propagates information down the syntax tree of the CAT memory model. The sets *M*(r) are computed bottom-up along the syntax tree. Interestingly, in our implementation, we do not compute the full fixpoint but let the top-down process trigger the required bottom-up computation.

Both sets are computed as least solutions to a common system of inequalities. As we work over powerset lattices (relations are sets after all), the order of the system will be inclusion. We understand each set *M* (r) and *A*(r) as a variable, thereby identifying it with its least solution. To begin with, we give the definition for *A*(r). In the base case, we have a relation r that occurs in a constraint of the memory model. The inequality is defined based on the shape of the constraint:

$$A(\mathtt{r}) \supseteq M(\mathtt{r}) \;(\textit{empty}) \qquad A(\mathtt{r}) \supseteq M(\mathtt{r}) \cap \mathtt{id} \;(\textit{irreflexive}) \qquad A(\mathtt{r}) \supseteq M(\mathtt{r}) \cap M(\mathtt{r}^{+})^{-1} \;(\textit{acyclic}).$$

For the emptiness constraint, all pairs of events that may be contained in the relation are relevant. If the constraint requires irreflexivity, what matters are the pairs (e, e). If the constraint requires acyclicity, we concentrate on the pairs (e1, e2), where (e1, e2) may be in relation r and (e2, e1) may be in relation r +. Note how the definition of active pairs triggers the computation of may pairs.
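The acyclicity base case can be sketched as follows (helper names are ours, and the tool's computation is symbolic rather than an explicit enumeration): the active pairs are exactly the may pairs whose reversal is transitively reachable, i.e. the pairs that can lie on a cycle.

```python
# Active pairs for an acyclicity constraint: A(r) = M(r) ∩ (M(r+))^-1.
def transitive_closure(r):
    closure = set(r)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def active_for_acyclic(M_r):
    """Keep only the may pairs (a, b) whose reverse (b, a) may be reached
    transitively -- the pairs that can close a cycle."""
    M_plus = transitive_closure(M_r)
    return {(a, b) for (a, b) in M_r if (b, a) in M_plus}

# (a, b) and (b, a) form a potential cycle; (b, c) cannot and is dropped.
print(active_for_acyclic({("a", "b"), ("b", "a"), ("b", "c")}))
```

This is the computation that separates the solid edges from the dotted edges in the introductory example.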

If the relation in the constraint is a composed one, the following inequalities propagate the information about the active pairs down the syntax tree of the CAT memory model:

$$\begin{array}{lcll}
A(\mathtt{r_1}) & \supseteq & A(\mathtt{r})^{-1} & \text{if } \mathtt{r} = \mathtt{r_1}^{-1} \\
A(\mathtt{r_1}) & \supseteq & A(\mathtt{r}) & \text{if } \mathtt{r} = \mathtt{r_1} \cap \mathtt{r_2} \text{ or } \mathtt{r} = \mathtt{r_1} \setminus \mathtt{r_2} \\
A(\mathtt{r_1}) & \supseteq & A(\mathtt{r}) \cap M(\mathtt{r_1}) & \text{if } \mathtt{r} = \mathtt{r_1} \cup \mathtt{r_2} \text{ or } \mathtt{r} = \mathtt{r_2} \setminus \mathtt{r_1} \\
A(\mathtt{r_1}) & \supseteq & \{x \in M(\mathtt{r_1}) \mid x ; M(\mathtt{r_2}) \cap A(\mathtt{r}) \neq \emptyset\} & \text{if } \mathtt{r} = \mathtt{r_1} ; \mathtt{r_2} \\
A(\mathtt{r_1}) & \supseteq & \{x \in M(\mathtt{r_1}) \mid M(\mathtt{r_1})^{*} ; x ; M(\mathtt{r_1})^{*} \cap A(\mathtt{r}) \neq \emptyset\} & \text{if } \mathtt{r} = \mathtt{r_1}^{+} \text{ or } \mathtt{r} = \mathtt{r_1}^{*}.
\end{array}$$

The definition maintains the invariant *A*(r) ⊆ *M*(r). If a pair (e1, e2) is relevant to relation r = r1<sup>−1</sup>, then (e2, e1) will be relevant to r1. We do not have to intersect *A*(r)<sup>−1</sup> with *M*(r)<sup>−1</sup> because *A*(r) ⊆ *M*(r) ensures *A*(r)<sup>−1</sup> ⊆ *M*(r)<sup>−1</sup>. We can avoid the intersection with the may pairs for the next case as well. There, *A*(r) ⊆ *M*(r) holds by the invariant and *M*(r) = *M*(r1) ∩ *M*(r2) by definition (see below). For union and the other case of subtraction, the intersection with *M*(r1) is necessary. There are symmetric definitions for union and intersection for r2. For a relation r1 that occurs in a relational composition r = r1 ; r2, the pairs (e1, e3) become relevant if they may be composed with a pair (e3, e2) in r2 to obtain a pair (e1, e2) relevant to r. Note that for r2 we again need the may pairs. The definition for r2 is similar. The definition for the (reflexive and) transitive closure follows the ideas for relational composition.
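The composition case can be made concrete with a small sketch (function and variable names are ours, purely illustrative):

```python
# A(r1) contribution when r = r1 ; r2: a pair (e1, e3) in M(r1) is active
# if it can compose with some (e3, e2) in M(r2) to yield a pair (e1, e2)
# that is already active for r.
def active_left_of_composition(M_r1, M_r2, A_r):
    return {(e1, e3) for (e1, e3) in M_r1
            for (f, e2) in M_r2
            if f == e3 and (e1, e2) in A_r}

# Assumed toy sets: only ("a","b") composes into the active pair ("a","d").
M_r1 = {("a", "b"), ("a", "c")}
M_r2 = {("b", "d"), ("x", "y")}
A_r = {("a", "d")}
print(active_left_of_composition(M_r1, M_r2, A_r))  # -> {('a', 'b')}
```

The pair ("a", "c") stays inactive: nothing in *M*(r2) extends it to an active pair of r, so no Boolean variable needs to be introduced for it.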

The definition of the may sets follows the syntax of the CAT memory model bottom-up. With ⊕ ∈ {∪, ∩, ;} and ⊗ ∈ {+, ∗, −1}, we have:

$$M(\mathtt{r_1} \oplus \mathtt{r_2}) \supseteq M(\mathtt{r_1}) \oplus M(\mathtt{r_2}) \qquad M(\mathtt{r}^{\otimes}) \supseteq M(\mathtt{r})^{\otimes} \qquad M(\mathtt{r_1} \setminus \mathtt{r_2}) \supseteq M(\mathtt{r_1}).$$

**Fig. 1.** Impact of the unrolling bound (*x*-axis) on the verification time (*y*-axis).


This simply applies the operator of the relation to the corresponding may sets. Subtraction (r1 \ r2) is the exception: it is not sound to over-approximate r2.

At the bottom level, the may sets are determined by the base relations. They depend on the shape of the relations and the positions of the events in the control flow. The relations loc, co, and rf are concerned with memory accesses. What makes it difficult to approximate these relations is our support for pointers and pointer arithmetic. Without further information, we have to conservatively assume that a memory event may access any address. To improve the precision of the may sets for loc, co, and rf, our fixpoint computation incorporates a *may-alias analysis*. We use a control-flow-insensitive Andersen-style analysis [17]. It incurs only a small overhead and produces a close over-approximation of the may sets. The analysis returns<sup>1</sup> a set of pairs of memory events *PTS* ⊆ (W ∪ R) × (W ∪ R) such that every pair of events outside *PTS* definitely accesses different addresses. Here, W are the store events in the program and R are the loads. Note that the analysis has to be control-flow insensitive as the given memory model may be very weak [10]. We have *M*(loc) ⊇ *PTS*. Similarly, *M*(co) and *M*(rf) are defined by *PTS* restricted to (W × W) and (W × R), respectively.
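An Andersen-style analysis can be sketched as follows (a deliberately minimal version handling only p = &a and p = q constraints; the constraint forms, names, and the may-alias check are ours, while the tool's analysis covers full pointer arithmetic):

```python
# Minimal control-flow-insensitive Andersen-style points-to analysis.
def andersen(address_of, copies):
    """address_of: (p, a) pairs for statements p = &a.
    copies: (p, q) pairs for statements p = q.
    Returns the points-to set of each register."""
    pts = {}
    for p, a in address_of:
        pts.setdefault(p, set()).add(a)
    changed = True
    while changed:                      # propagate pts(p) ⊇ pts(q) to fixpoint
        changed = False
        for p, q in copies:
            before = len(pts.setdefault(p, set()))
            pts[p] |= pts.get(q, set())
            changed |= len(pts[p]) != before
    return pts

pts = andersen(address_of=[("x", "a"), ("y", "b")], copies=[("z", "x")])

def may_alias(r1, r2):
    # cf. the footnote: two registers may alias iff their points-to sets meet
    return bool(pts.get(r1, set()) & pts.get(r2, set()))

print(may_alias("x", "z"), may_alias("x", "y"))  # -> True False
```

Pairs of memory events whose address registers cannot alias are exactly those excluded from *PTS*, which is what shrinks the may sets of loc, co, and rf.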

We stress the importance of the alias analysis for our relation analysis: loc, co, and rf are frequently used as building blocks of composite relations. Excessive may sets will therefore negatively affect the over-approximations of virtually all relations in a memory model, and keep the overall encoding unnecessarily large.

**Illustration.** We illustrate the relation analysis on the example from the introduction. Consider the constraint acyclic ghb-tso. The computation of the active set for the relation ghb-tso triggers the calculation of the may set, following the inequality *A*(ghb-tso) ⊇ *M*(ghb-tso) ∩ *M*(ghb-tso<sup>+</sup>)<sup>−1</sup>. The may set is the union of the may sets for the subrelations, shown by colored (dotted and solid) edges.

<sup>1</sup> This is a simplification: Andersen's analysis returns points-to sets, and we check by an intersection *PTS*(*r*1) ∩ *PTS*(*r*2) whether two registers may alias.

**Fig. 2.** Execution times (logarithmic scale) on Linux kernel litmus tests: impact of alias analysis (left) and comparison against Herd (right).

The intersection yields the edges that may lie on cycles of ghb-tso. They are drawn in solid. These solid edges in *A*(ghb-tso) are propagated down to the subrelations. For example, *A*(po-tso) ⊇ *A*(ghb-tso)∩*M* (po-tso) yields the solid black edges.

#### **4 Experiments**

We compare Dartagnan to CBMC [13] and Nidhugg [5,9], both model-specific tools, and to Herd [4,16] and the Dartagnan FMCAD-18 version [3,28] (without relation analysis), both taking CAT models as inputs. We also evaluate the impact of the alias analysis on the execution time.

**Benchmarks.** For CBMC, Nidhugg, and the FMCAD-18 Dartagnan, we evaluate the performance on 7 mutual exclusion benchmarks executed on TSO (all tools) and a subset of ARMv7 (only Nidhugg and Dartagnan). The results on Power are similar to those on ARM and thus omitted. We excluded Herd from this experiment since it did not scale even for small unrolling bounds [28]. We set a 5 min timeout for Parker, Dekker, and Peterson as this is sufficient to show the trends in the runtimes, and a 30 min timeout for the remaining benchmarks. To compare against Herd, and to evaluate the impact of the alias analysis, we run 4751 Linux kernel litmus tests (all tests from [1] without Linux spinlocks). The tests contain kernel primitives, such as RCU, on the Linux kernel model. We set a 30 min timeout.

**Evaluation.** The times for CBMC, Nidhugg-ARM, and the FMCAD-2018 version of Dartagnan grow exponentially for Parker (see Fig. 1). The growth in CBMC and FMCAD-2018 is due to the explosion of the encoding. For the latter, the solver runs out of memory with unrolling bounds 20 (TSO) and 10 (ARM). For Nidhugg-ARM, the tool explores many unnecessary executions. The verification times for Nidhugg-TSO and the current version of Dartagnan grow linearly. The latter is due to the relation analysis. For Peterson, the results are similar except for CBMC, which matches Dartagnan's performance.

For Dekker, Nidhugg outperforms both CBMC and Dartagnan. This is because the number of executions grows slowly compared to the explosion of the number of instructions. The executions in both memory models coincide, making the performance on ARM comparable to that on TSO for Nidhugg. The difference is due to the optimal exploration in TSO, but not in ARM. Relation analysis has some impact on the performance (see FMCAD-2018 vs. Dartagnan), but the encoding size still grows faster than the number of executions.

The benchmarks Burns, Bakery, and Lamport demonstrate the opposite trend: the number of executions grows much faster than the size of the encoding. Here, CBMC and Dartagnan outperform Nidhugg. Notice that for Burns, Nidhugg performs better on ARM than on TSO with unrolling bound 5. This is counter-intuitive since one expects more executions on ARM. Although the numbers of executions coincide, the exploration time is higher on TSO due to a different search algorithm. For Szymanski, similar results hold except for Dartagnan-ARM, where the encoding grows exponentially.

Figure 2 (left) shows the verification times for the current version of Dartagnan with and without alias analysis. The alias analysis results in a speed-up of more than two orders of magnitude in benchmarks with several threads accessing up to 18 locations. Figure 2 (right) compares the performance of Dartagnan against Herd. We used the Knaster-Tarski encoding and alias analysis since they yield the best performance. Herd outperforms Dartagnan on small test instances (less than 1 s execution time). This is due to the JVM startup time and the preprocessing costs of Dartagnan. However, on large benchmarks, Herd times out while Dartagnan takes less than 10 s.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **When Human Intuition Fails: Using Formal Methods to Find an Error in the "Proof" of a Multi-agent Protocol**

Jennifer A. Davis<sup>1</sup>, Laura R. Humphrey2(B) , and Derek B. Kingston<sup>3</sup>

> <sup>1</sup> Collins Aerospace, Cedar Rapids, IA 52498, USA jen.davis@collins.com
> <sup>2</sup> Air Force Research Lab, Dayton, OH 45433, USA laura.humphrey@us.af.mil
> <sup>3</sup> Aurora Flight Sciences, Manassas, VA 20110, USA kingston.derek@aurora.aero

**Abstract.** Designing protocols for multi-agent interaction that achieve the desired behavior is a challenging and error-prone process. The standard practice is to manually develop proofs of protocol correctness that rely on human intuition and require significant effort to develop. Even then, proofs can have mistakes that may go unnoticed after peer review, modeling and simulation, and testing. The use of formal methods can reduce the potential for such errors. In this paper, we discuss our experience applying model checking to a previously published multi-agent protocol for unmanned air vehicles. The original publication provides a compelling proof of correctness, along with extensive simulation results to support it. However, analysis through model checking found an error in one of the proof's main lemmas. In this paper, we start by providing an overview of the protocol and its original "proof" of correctness, which represents the standard practice in multi-agent protocol design. We then describe how we modeled the protocol for a three-vehicle system in a model checker, the counterexample it returned, and the insight this counterexample provided. We also discuss benefits, limitations, and lessons learned from this exercise, as well as what future efforts would be needed to fully verify the protocol for an arbitrary number of vehicles.

**Keywords:** Multi-agent systems · Distributed systems · Autonomy · Model checking

#### **1 Introduction**

Many robotics applications require multi-agent interaction. However, designing protocols for multi-agent interaction that achieve the desired behavior can be

This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2019 I. Dillig and S. Tasiran (Eds.): CAV 2019, LNCS 11561, pp. 366–375, 2019. https://doi.org/10.1007/978-3-030-25540-4\_20

D. B. Kingston—Supported by AFRL/RQ contract #FA8650-17-F-2220 and AFOSR award #17RQCOR417. DISTRIBUTION A. Approved for public release: distribution unlimited. Case #88ABW-2018-4275.

challenging. The design process is often manual, i.e. performed by humans, and generally involves creating mathematical models of possible agent behaviors and candidate protocols, then manually developing a proof that the candidate protocols are correct with respect to the desired behavior. However, human-generated proofs can have mistakes that may go unnoticed even after peer review, modeling and simulation, and testing of the resulting system.

Formal methods have the potential to reduce such errors. However, while the use of formal methods in multi-agent system design is increasing [2,6,8,11], it is our experience that manual approaches are still the norm. Here, we hope to motivate the use of formal methods for multi-agent system design by demonstrating their value in a case study involving a manually designed decentralized protocol for dividing surveillance of a perimeter across multiple unmanned aerial vehicles (UAVs). This protocol, called the Decentralized Perimeter Surveillance System (DPSS), was previously published in 2008 [10], has received close to 200 citations to date, and provides a compelling "proof" of correctness backed by extensive simulation results.

We start in Sect. 2 by giving an overview of DPSS, the convergence bounds that comprise part of its specification, and the original "proof" of correctness. In Sect. 3, we give an overview of the three-UAV DPSS model we developed in the Assume Guarantee REasoning Environment (AGREE) model checker [3]. In Sect. 4, we present the analysis results returned by AGREE, including a counterexample to one of the convergence bounds. Section 5 concludes with a discussion of benefits, challenges, and limitations of our modeling process and how to help overcome them, and what future work would be required to modify and fully verify DPSS for an arbitrary number of UAVs.

#### **2 Decentralized Perimeter Surveillance System (DPSS)**

UAVs can be used to perform continual, repeated surveillance of a large perimeter. In such cases, more frequent coverage of points along the perimeter can be achieved by evenly dividing surveillance of it across multiple UAVs. However, coordinating this division is challenging in practice for several reasons. First, the exact location and length of the perimeter may not be known a priori, and it may change over time, as in a growing forest fire or oil spill. Second, UAVs might go offline and come back online, e.g. for refueling or repairs. Third, inter-UAV communication is unreliable, so it is not always possible to immediately communicate local information about perimeter or UAV changes. However, such information is needed to maintain an even division of the perimeter as changes occur. DPSS provides a method to solve this problem with minimal inter-UAV communication for perimeters that are isomorphic to a line segment.

Let the perimeter start as a line segment along the x-axis with its left endpoint at x = 0 and its right at x = P. Let N be the number of UAVs in the system or on the "team," indexed from left to right as 1,...,N. Divide the perimeter into segments of length P/N, one per UAV. Then the optimal configuration of DPSS as depicted in Fig. 1 is defined as follows (see Ref. [10] for discussion of why this definition is desirable).

**Definition 1.** *Consider two sets of perimeter locations: (1) ⌊i + ½(−1)^i⌋ P/N and (2) ⌊i − ½(−1)^i⌋ P/N, where ⌊·⌋ returns the largest integer less than or equal to its argument. The optimal configuration is realized when UAVs synchronously oscillate between these two sets of locations, each moving at constant speed V.*
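Under this reading of the formula (floor brackets, as described in the definition), the two position sets can be made concrete with a few lines of Python; P, N, and the example values below are ours:

```python
from math import floor

def dpss_positions(P, N):
    """Compute the two sets of oscillation endpoints from Definition 1.

    Set 1: floor(i + (1/2)(-1)^i) * P/N,  Set 2: floor(i - (1/2)(-1)^i) * P/N,
    for UAVs i = 1..N.
    """
    set1 = [floor(i + 0.5 * (-1) ** i) * P / N for i in range(1, N + 1)]
    set2 = [floor(i - 0.5 * (-1) ** i) * P / N for i in range(1, N + 1)]
    return set1, set2

# With P = 9 and N = 3, segments are [0,3], [3,6], [6,9]:
s1, s2 = dpss_positions(9, 3)
# s1 == [0.0, 6.0, 6.0]: UAV 1 at the left endpoint, UAVs 2 and 3 co-located at x = 6
# s2 == [3.0, 3.0, 9.0]: UAVs 1 and 2 co-located at x = 3, UAV 3 at the right endpoint
```

Note how, in each set, adjacent UAVs share a location: neighbors meet at their shared segment boundaries (or sit at a perimeter endpoint), matching the synchronous oscillation of Fig. 1.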

**Fig. 1.** Optimal DPSS configuration, in which UAVs are evenly spaced along the perimeter and synchronously oscillate between segment boundaries.

The goal of DPSS is to achieve the optimal configuration in the steady state, i.e. when the perimeter and involved UAVs remain constant. The DPSS protocol itself is relatively simple. Each UAV i stores a vector ξ_i = [P_{R,i} P_{L,i} N_{R,i} N_{L,i}]^T of coordination variables that capture its beliefs (which may be incorrect) about the perimeter length (P_{R,i} and P_{L,i}) and number of UAVs (N_{R,i} and N_{L,i}) to its right and left. When neighboring UAVs meet, "left" UAV i learns updated values for its "right" variables P′_{R,i} = P_{R,i+1} and N′_{R,i} = N_{R,i+1} + 1 from "right" UAV i + 1, and likewise UAV i + 1 updates its "left" variables P′_{L,i+1} = P_{L,i} and N′_{L,i+1} = N_{L,i} + 1. While values for these variables may still be incorrect, the two UAVs will at least have matching coordination variables and thus a consistent estimate of their shared segment boundary. The two UAVs then "escort" each other to their estimated shared segment boundary, then split apart to surveil their own segments. Note that UAVs only change direction when they reach a perimeter endpoint or when starting or stopping an escort, which means a UAV will travel outside its segment unless another UAV arrives at the segment boundary at the same time (or the end of the segment is a perimeter endpoint).
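The coordination-variable exchange at a meeting can be sketched as follows; this is a minimal reading of the update rule in the text, and the dict-of-variables representation is our own:

```python
def meet(left, right):
    """Exchange coordination variables when neighboring UAVs meet.

    The "left" UAV adopts the right UAV's belief about everything to its
    right (adding 1 for the right UAV itself), and symmetrically for the
    "right" UAV. Afterward the two agree on their shared segment boundary
    estimate, even if the adopted beliefs are still wrong.
    """
    left["P_R"], left["N_R"] = right["P_R"], right["N_R"] + 1
    right["P_L"], right["N_L"] = left["P_L"], left["N_L"] + 1

# Illustrative (made-up) beliefs before a meeting:
uav1 = {"P_R": 9.0, "P_L": 0.0, "N_R": 5, "N_L": 0}
uav2 = {"P_R": 4.0, "P_L": 7.0, "N_R": 1, "N_L": 3}
meet(uav1, uav2)
# uav1 now believes N_R = 2: UAV 2 itself plus the one UAV that UAV 2 sees to its right.
```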

Eventually, leftmost UAV 1 will discover the actual left perimeter endpoint, accurately set N_{L,1} = 0 and P_{L,1} = 0, then turn around and update P_{L,1} continuously as it moves. A similar situation holds for rightmost UAV N. Accurate information will be passed along to other UAVs as they meet, and eventually all UAVs will have correct coordination variables and segment boundary estimates. Since UAVs also escort each other to shared segment boundaries whenever they meet, eventually the system reaches the optimal configuration, in which UAVs oscillate between their true shared segment boundaries.

An important question is how long it takes DPSS to converge to the optimal configuration. Each time the perimeter or number of UAVs changes, it is as if the system is reinitialized; UAVs no longer have correct coordination variables and so the system is no longer converged. However, if DPSS is able to re-converge relatively quickly, it will often be in its converged state.

Ref. [10] claims that DPSS converges within 5T, where T = P/V is the time it would take a single UAV to traverse the entire perimeter if there were no other UAVs in the system. It describes DPSS as two algorithms: Algorithm A, in which UAVs start with correct coordination variables, and Algorithm B, in which they do not. The proof strategy is then to argue that Algorithm A converges in 2T (Theorem 1) and Algorithm B achieves correct coordination variables in 3T (Lemma 1)<sup>1</sup>. At that point, Algorithm B converts to Algorithm A, so the total convergence time is 2T + 3T = 5T (Theorem 2)<sup>2</sup>.

**Fig. 2.** Claimed worst-case coordination variable convergence for Algorithm B.

Informally, the original argument for Lemma 1 is that information takes time T to travel along the perimeter. The worst case occurs when all UAVs start near one end of the perimeter, e.g. the left endpoint, so that the rightmost UAV N reaches the right endpoint around time T. UAV N then turns around and through a fast series of meetings, correct "right" coordination variables are propagated to the other UAVs, all of which then start moving left. Due to incorrect "left" coordination variables, UAV N − 1 and UAV N might think their shared segment boundary is infinitesimally close to the left endpoint. The UAVs travel left until they are almost at the left perimeter endpoint around time 2T. However, since UAV N thinks its segment boundary is near the left endpoint, it ends its escort and goes right without learning the true location of the left perimeter endpoint. Leftmost UAV 1 learns the true location of the left perimeter endpoint and this information will be passed to the other UAVs as they meet, but the information will have to travel the perimeter once again to reach the rightmost UAV N around time 3T. This situation is depicted in Fig. 2.

Through model checking, we were able to find a counterexample to this claimed bound, which will be presented in Sect. 4. But first, we overview the model used for analysis through model checking.

#### **3 Formal Models**

We briefly overview the formal models developed in AGREE for a three-UAV version of DPSS as described by Algorithm B. Models for Algorithm A and

<sup>1</sup> We label this Lemma 1 for convenience; it is unlabeled in [10].

<sup>2</sup> A version of the original proof is on GitHub [1] in file dpssOriginalProof.pdf.

Algorithm B along with a more detailed description of the Algorithm B model are available on GitHub [1]<sup>3</sup>.

AGREE is an infinite-state model checker capable of analyzing systems with real-valued variables, as is the case with DPSS. AGREE uses assume/guarantee reasoning to verify properties of architectures modeled as a top-level system with multiple lower-level components, each having a formally specified assume/guarantee contract. Each contract consists of a set of assumptions on the inputs and guarantees on the outputs, where inputs and outputs can be reals, integers, or booleans. System assumptions and component assume/guarantee contracts are assumed to be true. AGREE then attempts to verify that (a) component assumptions hold given system assumptions, and (b) system guarantees hold given component guarantees. AGREE poses this verification problem as a satisfiability modulo theory (SMT) problem [4] and uses a k-induction model checking approach [7] to search for counterexamples that violate system-level guarantees given system-level assumptions and component-level assume/guarantee contracts. The language used by AGREE is an "annex" to the Architecture Analysis and Design Language (AADL) [5].
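AGREE delegates the search for counterexamples to a k-induction engine; the idea behind k-induction can be illustrated on a toy explicit-state system. The brute-force sketch below is our own illustration of the proof rule itself, not of AGREE's symbolic, SMT-based implementation:

```python
from itertools import product

def k_induction(states, init, trans, prop, k):
    """Brute-force k-induction over an explicit finite state space.

    Base case: prop holds on every state reachable within k steps.
    Step case: every path of k+1 consecutive states whose first k states
    satisfy prop also satisfies prop in its last state.
    Returns True if prop is proved; False is inconclusive (or a real bug).
    """
    # Base case: breadth-first unrolling to depth k.
    layer = {s for s in states if init(s)}
    for _ in range(k + 1):
        if any(not prop(s) for s in layer):
            return False
        layer = {t for s in layer for t in states if trans(s, t)}
    # Step case: quantify over ALL k+1 step sequences, reachable or not.
    for path in product(states, repeat=k + 1):
        if (all(trans(a, b) for a, b in zip(path, path[1:]))
                and all(prop(s) for s in path[:-1])
                and not prop(path[-1])):
            return False
    return True

# A counter modulo 9 over states 0..9 never reaches state 9:
proved = k_induction(range(10), lambda s: s == 0,
                     lambda s, t: t == (s + 1) % 9, lambda s: s != 9, 1)
```

An SMT-based engine such as JKind poses the same two cases as satisfiability queries instead of enumerating states, which is what makes infinite-state (real-valued) models like DPSS tractable.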

AGREE's ability to analyze systems modeled as a top-level system with multiple lower-level components provides a natural fit for DPSS. The three-UAV AGREE DPSS model consists of a single top-level system model, which we call the "System," and a component-level UAV model that is instantiated three times, which we call the "UAV(s)." The System essentially coordinates a discrete event simulation of the UAVs as they execute the DPSS protocol, where events include a UAV reaching a perimeter endpoint or two UAVs starting or stopping an escort. In the initial state, the System sets valid ranges for each UAV's initial position through assumptions that constrain the UAVs to be initialized between the perimeter endpoints and ordered by ID number from left to right. System assumptions also constrain UAV initial directions to be either left or right (though a UAV might have to immediately change this value, e.g., if it is initialized at the left endpoint headed left). These values become inputs to the UAVs. The System determines values for other UAV inputs, including whether a UAV is co-located with its right or left neighbor and the true values for the left and right perimeter endpoints. Note the true perimeter endpoints are only used by the UAVs to check whether they have reached the end of the perimeter, not to calculate boundary segment endpoints. The System also establishes data ports between UAVs, so that each UAV can receive updated coordination variable values from its left or right neighbor as inputs and use them (but only if they are co-located).

The last System output that serves as a UAV input is the position of the UAV. At initialization and after each event, the System uses the globally known constant UAV speed V and other information from each UAV to determine the amount of time δt until the next event, and then it updates the position of each

<sup>3</sup> AADL projects are in AADL sandbox projects. Algorithm A and B models for three UAVs are in projects DPSS-3-AlgA-for-paper and DPSS-3-AlgB-for-paper. A description of the Algorithm B model is in file modelAlgorithmB.pdf.

UAV. Determining the time of the next event requires knowing the direction and next anticipated "goal" location of each UAV, e.g. estimated perimeter endpoint or shared segment boundary. Each UAV outputs these values, which become inputs to the System. Each UAV also outputs its coordination variables P_{R,i}, P_{L,i}, N_{R,i}, and N_{L,i}, which become System inputs that are used in System guarantees that formalize Theorem 1, Lemma 1, and Theorem 2 of Sect. 2. Note that we bound the integers N_{R,i} and N_{L,i} because, in order to calculate estimated segment boundaries, which requires dividing perimeter length by the number of UAVs, we must implement a lookup table that copies the values of N_{R,i} and N_{L,i} to real-valued versions of these variables. This is due to an interaction between AGREE and the Z3 SMT solver [4] used by AGREE. If we directly cast N_{R,i} and N_{L,i} to real values in AGREE, they are encoded in Z3 using the to_real function. Perimeter values P_{R,i} and P_{L,i} are directly declared as reals. However, Z3 views integers converted by the to_real function as constrained to have integer values, so it cannot use the specialized solver for reals that is able to analyze this model.
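The System's time-advance step can be sketched as follows. This is a simplified, hypothetical reading of the event computation described above: the function name, the representation, and the head-on meeting rule are ours.

```python
def next_event_dt(pos, goal, V):
    """Time until the next discrete event, given each UAV's position and
    current goal point on the perimeter (all UAVs move at speed V).

    Candidate events: a UAV reaches its goal (estimated perimeter endpoint
    or segment boundary), or two neighbors moving toward each other
    become co-located and start an escort.
    """
    dts = [abs(g - p) / V for p, g in zip(pos, goal)]
    for i in range(len(pos) - 1):
        # Neighbors approaching head-on meet halfway between them.
        if goal[i] > pos[i] and goal[i + 1] < pos[i + 1]:
            dts.append((pos[i + 1] - pos[i]) / (2 * V))
    return min(dt for dt in dts if dt > 0)

# Two UAVs at x = 0 and x = 6 heading toward each other at V = 1 meet after 3 time units.
```

Because every UAV moves at the same constant speed, each inter-event interval is determined exactly, which is what lets the model advance in discrete steps despite positions being real-valued.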

#### **4 Formal Analysis Results**

In this section, we discuss the analysis results provided by AGREE for Algorithm A and Algorithm B, though we focus on Algorithm B.

**Algorithm A**: Using AGREE configured to utilize the JKind k-induction model checker [7] and the Z3 SMT solver, we have proven Theorem 1, that Algorithm A converges within 2T, for N = 1, 2, 3, 4, 5, and 6 UAVs. Computation time prevented us from analyzing more than six UAVs. For reference, N = 1 through N = 4 ran in under 10 min each on a laptop with two cores and 8 GB RAM. The same laptop analyzed N = 5 overnight. For N = 6, the analysis took approximately twenty days on a computer with 40 cores and 128 GB memory.

**Algorithm B**: We were able to prove Theorem 2, that DPSS converges within 5T, for N = 1, 2, and 3 UAVs and with each UAV's coordination variables N_{R,i} and N_{L,i} bounded between 0 and 20. In fact, we found the convergence time to be within (4 + 1/3)T. However, AGREE produced a counterexample to Lemma 1, that every UAV obtains correct coordination variables within 3T, for N = 3. In fact, we incrementally increased this bound, finding counterexamples up to (3 + 1/2)T, but convergence is guaranteed within (3 + 2/3)T.

One of the shorter counterexamples provided by AGREE shows the UAVs obtaining correct coordination variables in 3.0129T. Full details are available on GitHub [1],<sup>4</sup> but we outline the steps in Fig. 3. In this counterexample, UAV 1 starts very close to the left perimeter endpoint heading right, and UAVs 2 and 3 start in the middle of segment 3 headed left. UAVs 1 and 2 meet near the middle of the perimeter and head left toward what they believe to be their shared segment boundary. This is very close to the left perimeter endpoint because, due to initial conditions, they believe the left perimeter endpoint to be much

<sup>4</sup> A spreadsheet with counterexample values for all model variables is located under AADL sandbox projects/DPSS-3-AlgB-for-paper/results 20180815 eispi.

farther away than it actually is. Then they split, and UAV 1 learns where the left perimeter endpoint actually is, but UAV 2 does not. UAV 2 heads right and meets UAV 3 shortly afterward, and they move to what they believe to be their shared segment boundary, which is likewise very close to the right perimeter endpoint. Then they split, and UAV 3 learns where the right perimeter endpoint is, but UAV 2 does not. UAV 2 heads left, meets UAV 1 shortly after, and learns correct "left" coordination variables. However, UAV 2 still believes the right perimeter endpoint to be farther away than it actually is, so UAV 1 and 2 estimate their shared segment boundary to be near the middle of the perimeter. They then head toward this point and split apart, with UAV 1 headed left and still not having correct "right" coordination variables. UAV 2 and 3 then meet, exchange information, and now both have correct coordination variables. They go to their actual shared boundary, split apart, and UAV 2 heads left toward UAV 1. UAV 1 and 2 then meet on segment 1, exchange information, and now all UAVs have correct coordination variables.

The counterexample reveals a key intuition that was missing in Lemma 1. The original argument did not fully consider the effects of initial conditions and so only considered a case in which UAVs came close to *one* end of the perimeter without actually reaching it. The counterexample shows it can happen at *both* ends if initial conditions cause the UAVs to believe the perimeter endpoints to be farther away than they actually are. This could happen if the perimeter were to quickly shrink, causing the system to essentially "reinitialize" with incorrect coordination variables.

**Fig. 3.** Counterexample to Lemma 1. Dots to the left of a UAV number indicate it has correct "left" variables, and likewise for the right.

Analysis for three UAVs for Algorithm B completed in 18 days on a machine with 256 GB RAM and 80 cores.

#### **5 Discussion and Conclusions**

Formal modeling and analysis through AGREE had many benefits. First, it allowed us to analyze DPSS, a decentralized protocol for distributing a surveillance task across multiple UAVs. Though the original publication on DPSS provided a convincing human-generated proof and simulation results to support claims about its convergence bounds, analysis revealed that one of the key lemmas was incorrect. Furthermore, the counterexample returned by AGREE provided insight into why it was incorrect. Second, formal modeling in and of itself allowed us to find what were essentially technical typos in the original paper. For example, the formula for dividing the perimeter across UAVs only accounted for changes in estimates of the right perimeter endpoint and not the left, so we corrected the formula for our model. We also discovered that certain key aspects of the protocol were underspecified. In particular, it is unclear what should happen if more than two UAVs meet at the same time. Analysis showed this occurring for as few as three UAVs in Algorithm B, and simulations in the original paper showed this happening frequently, but this behavior was not explicitly described. Here, we decided that if all three UAVs meet to the left of UAV 3's estimated segment, UAV 3 immediately heads right and the other two follow the normal protocol to escort each other to their shared border. Otherwise, the UAVs all travel left together to the boundary between segments 2 and 3, then UAV 3 breaks off and heads right while the other two follow the normal protocol.

This brings us to a discussion of challenges and limitations. First, in terms of more than two UAVs meeting at a time, simulations in the original paper implement a more complex behavior in which UAVs head to the closest shared boundary and then split apart into smaller and smaller groups until reaching the standard case of two co-located UAVs. This behavior requires a more complex AGREE model that can track "cliques" of more than two UAVs, and it is difficult to validate the model due to long analysis run times. Second, we noted in Sect. 4 that in our model, UAV coordination variables N_{R,i} and N_{L,i} have an upper bound of 20. In fact, with an earlier upper bound of 3, we found the bound for Lemma 1 to be (3 + 1/3)T and did not consider that it would depend on upper bounds for N_{R,i} and N_{L,i}. We therefore cannot conclude that even (3 + 2/3)T is the convergence time for Lemma 1. Third, and related to the last point, model checking with AGREE can only handle up to three UAVs for Algorithm B. Due to these limitations, we cannot say for sure what the upper bound for DPSS actually is, even if we believe it to be 5T. If it is higher, then it takes DPSS longer to converge, meaning it can handle less frequent changes than originally believed. We are therefore attempting to transition to theorem provers such as ACL2 [9] and PVS [12] to develop a proof of convergence bounds for an arbitrary number of UAVs, an arbitrary upper bound on N_{R,i} and N_{L,i}, and an arbitrary perimeter length (which was set to a fixed size to make the model small enough to analyze).

In terms of recommendations and lessons learned, it was immensely useful to work with the author of DPSS to formalize our model. Multi-agent protocols like DPSS are inherently complex, and it is not surprising that the original paper contained some typos, underspecifications, and errors. In fact, the original paper explains DPSS quite well and is mostly correct, but it is still challenging for formal methods experts to understand complex systems from other disciplines, so access to subject matter experts can greatly speed up formalization.

**Acknowledgment.** We thank John Backes for his guidance on efficiently modeling DPSS in AGREE and Aaron Fifarek for running some of the longer AGREE analyses.

#### **References**



The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Extending NUXMV with Timed Transition Systems and Timed Temporal Properties**

Alessandro Cimatti, Alberto Griggio,

Enrico Magnago, Marco Roveri, and Stefano Tonetta

Fondazione Bruno Kessler, Trento, Italy roveri@fbk.eu

**Abstract.** NUXMV is a well-known symbolic model checker, which implements various state-of-the-art algorithms for the analysis of finite- and infinite-state transition systems and temporal logics. In this paper, we present a new version that supports timed systems and logics over continuous super-dense semantics. The system specification language was extended with clocks to constrain the timed evolution. The support for temporal properties has been expanded to include MTL0,∞ formulas with parametric intervals. The analysis is performed via a reduction to verification problems in the discrete-time case. The internal representation of traces has been extended to go beyond the lasso-shaped form, to take into account the possible divergence of clocks. We evaluated the new features by comparing NUXMV with other verification tools for timed automata and MTL0,∞, considering different benchmarks from the literature. The results show that NUXMV is competitive with, and in many cases performs better than, state-of-the-art tools, especially on validity problems for MTL0,∞.

#### **1 Introduction**

NUXMV [1] is a symbolic model checker for the analysis of synchronous finite- and infinite-state transition systems. For the finite-state case, NUXMV features strong verification engines based on state-of-the-art SAT-based algorithms. For the infinite-state case, NUXMV features SMT-based verification techniques, implemented through a tight integration with the MATHSAT5 solver [2]. NUXMV has taken part in recent editions of the hardware model checking competition, where it has proved very competitive with the state of the art. NUXMV also compares well with other model checkers for infinite-state systems. Moreover, it has been successfully used in several application domains, both in research and industrial settings. It is currently the core verification engine for many other tools (including industrial ones) for requirements analysis, contract-based design, model checking of hybrid systems, safety assessment, and software model checking.

In this paper, we put emphasis on the novel extensions to NUXMV to support timed synchronous transition systems, which extend symbolically-represented infinite-state transition systems with clocks. The main novelties of this new version are the following. The NUXMV input language was extended to enable the description of symbolic

**Fig. 1.** The high level architecture of NUXMV.

synchronous timed transition systems with super-dense time semantics (where signals can have a sequence of values at any real time t). The support for temporal properties has been expanded to include MTL0,∞ formulas with parametric intervals [3,4]. Therefore, NUXMV now supports model checking of invariant, LTL, and MTL0,∞ properties over (symbolic) timed transition systems, as well as validity/satisfiability checking of LTL and MTL0,∞ formulas. This is done via a correct and complete reduction to verification problems in the discrete-time case (thus allowing for the use of mature and efficient verification engines). In order to represent and find infinite traces where clocks may diverge, we extended the representation of lasso-shaped traces (over discrete semantics) and modified the bounded model checking algorithm to properly encode timed traces. We remark that NUXMV is more expressive than timed automata, since the native management of time is added on top of an infinite-state transition system; this makes it straightforward to encode stopwatches and comparisons between clocks. We carried out an experimental evaluation comparing NUXMV with other state-of-the-art verification tools for timed automata, considering different benchmarks taken from the distributions of competitor tools.

#### **2 Software Architecture**

The high level architecture of NUXMV is depicted in Fig. 1. For symbolic transition systems, NUXMV behaves like the previous version of the system [1], thus allowing for full backward compatibility (apart from some new reserved keywords). It provides the user with all the basic model checking algorithms for finite domains, both using BDDs (via CUDD [5]) and SAT (e.g. MINISAT [6]). It supports various SMT-based model checking algorithms (implemented through a tight integration with the MATHSAT5 solver [2]) for the analysis of finite- and infinite-state systems (e.g. IC3 [7–9], k-liveness [10], liveness to safety [11]). We refer the reader to [1] for a thorough discussion of these consolidated functionalities for the discrete-time setting.

**Fig. 2.** A simple TIMED-NUXMV program.

To support the specification and model checking of invariant, LTL, and MTL0,∞ properties for timed transition systems, and the validity checking of properties over dense time semantics, NUXMV has been extended w.r.t. [1] as discussed hereafter.


– We modified the encoding of loops in the bounded model checking algorithms to take into account that traces may contain diverging variables, so as to allow for the verification and validation of LTL and MTL0,∞ properties.

For portability, NUXMV has been developed mainly in standard C with some new parts in standard C++. It compiles and executes on Linux, MS Windows, and MacOS.

#### **3 Language Extensions**

**Timed Transition Systems.** Discrete-time transition systems are described in NUXMV by a set V of variables, an initial condition I(V), a transition condition T(V, V′), and an invariant condition Z(V). Variables are introduced with the keyword VAR and can have type Boolean, scalar, integer, real, or array. The initial and the invariant conditions are introduced with the keywords INIT and INVAR and are expressions over the variables in V. The transition condition is introduced with TRANS and is an expression over variables in V and V′, where for each variable v in V, V′ contains the "next" version, denoted in the language by next(v). Expressions may use standard symbols in the theory associated to the variable types and user-defined rigid functions that are declared with the keyword FUN.
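This I/T/Z reading can be mimicked on a toy explicit-state system in Python; the sketch below is our illustration of the semantics (INIT, TRANS, and INVAR as predicates), not of NUXMV's symbolic machinery:

```python
def bounded_reach(states, I, T, Z, bad, depth):
    """Explicit-state reading of a NUXMV-style system: search for a state
    satisfying `bad` within `depth` transitions, where executions start in
    a state satisfying I, step via T, and satisfy the invariant Z in every
    state (states violating Z are pruned)."""
    frontier = {s for s in states if I(s) and Z(s)}
    for _ in range(depth + 1):
        if any(bad(s) for s in frontier):
            return True
        frontier = {t for s in frontier for t in states if T(s, t) and Z(t)}
    return False

# A counter modulo 4 with INVAR "s != 3": state 2 is reachable,
# while state 3 is ruled out by the invariant.
```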

The input language of NUXMV has been extended to allow the specification of timed transition systems (TTS), which are enabled by the annotation @TIME_DOMAIN continuous at the beginning of a model description.

Besides the standard types, in the timed case, state variables can be declared of type clock. All variables of type different from clock are discrete variables.

The language provides a built-in clock variable, accessible through the reserved keyword time. It represents the amount of time elapsed from the initial state until now. time is initialized to 0 and its value does not change in discrete transitions. While all other clock variables can be used in any expression in the model definition, time can be used only in comparison with constants.

Initial, transition, and invariant conditions are specified in NUXMV with the keywords INIT, TRANS, and INVAR, as in the discrete case. In particular, TRANS allows specifying "arbitrary" clock resets. Like all other NUXMV state variables, if a clock is not constrained during a discrete transition, its next value is chosen non-deterministically.

Clock variables can be used in INVAR only in the form ϕ → φ, where ϕ is a formula built using only the discrete variables and φ is convex over the clock variables. This closely maps the concept of location invariant described for timed automata: all locations satisfying ϕ have invariant φ.

An additional constraint, not allowed in the discrete-time case, is introduced with the keyword URGENT followed by a predicate over the discrete variables; it specifies a set of locations in which time cannot elapse.

*Comparison with Timed Automata.* Timed automata can be represented by TTSs by simply introducing a variable representing the locations of the automaton. Note that in a TTS it is possible to express any kind of constraint over clock variables in discrete transitions, while in timed automata clocks can only be reset to 0 in transitions or compared to constants in guards. Moreover, the discrete variables of a timed automaton always have finite domains, while in a TTS the discrete variables may have infinite domains. This additional expressiveness makes it possible to describe more complex behaviors (e.g. it is straightforward to encode stopwatches and comparisons between clocks), at the cost of losing decidability of the model checking problem.

**Specifications.** NUXMV's support for LTL has been extended to allow the use of MTL0,∞ operators [12] and other operators such as event-freezing functions [13] and dense versions of the LTL X and Y operators. The MTL0,∞ bounded operators extend the LTL ones of NUXMV to allow bounds either of the form [c, ∞), where c is a constant greater than or equal to 0, e.g. F[0,+oo) ϕ, or generic expressions over parametric/frozen variables, e.g. F[0, 3+v] ϕ, where v is a frozen variable.
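On an explicit, point-based timed trace, a bounded finally such as F[a,b] ϕ can be evaluated as sketched below. This is a simplified illustration of ours; the actual MTL0,∞ semantics in NUXMV is over super-dense time.

```python
def F_interval(trace, i, lo, hi, phi):
    """Evaluate F[lo,hi] phi at step i of a trace [(timestamp, state), ...]:
    some step at or after i, within time window [lo, hi] of step i's
    timestamp, satisfies phi. Use hi = float("inf") for [c, oo) bounds."""
    t0 = trace[i][0]
    return any(lo <= t - t0 <= hi and phi(s) for t, s in trace[i:])

trace = [(0.0, "a"), (2.0, "b"), (5.0, "c")]
# F[0,3] (state == "b") holds at step 0; F[0,1] (state == "b") does not.
```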

In the timed setting, next and previous operators come in two versions. The standard LTL operators X and Y require their argument to hold, respectively, after and before a discrete transition. Dually, X˜ and Y˜ have been introduced to predicate about the evolution of the system over time. They are always FALSE across discrete steps, and across timed transitions they hold if the argument holds in the open interval immediately after/before (respectively) the current step. The disjunction X(ϕ) ∨ X˜(ϕ) checks whether ϕ holds after the current state, without distinguishing between timed and discrete evolution.

The event-freezing operators *at next* and *at last*, written @F˜ and @O˜, are binary operators allowed in LTL specifications. The left-hand side is a term, while the right-hand side is a temporal formula. They return the value of the term at, respectively, the next and the last point in time at which the formula is true. If the formula never holds in the future [past], the operator evaluates to a default value.

time_until and time_since are two additional unary operators that can be used in LTL specifications of timed models. Their argument must be a Boolean predicate over current and next variables. time_until(ϕ) evaluates to the amount of time that must elapse to reach the next state in which ϕ holds, while time_since(ϕ) evaluates to the amount of time elapsed since the last state in which ϕ held. As with the @F˜ and @O˜ operators, if no such state exists, they evaluate to a default value.
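A point-based sketch of time_until over an explicit timed trace, simplified in that we test a predicate on the current state only (the real operator ranges over current and next variables):

```python
def time_until(trace, i, phi, default=float("inf")):
    """Time elapsing from step i of [(timestamp, state), ...] until the
    next step whose state satisfies phi; `default` if no such step exists
    (mirroring the default-value behavior described in the text)."""
    t0 = trace[i][0]
    for t, s in trace[i:]:
        if phi(s):
            return t - t0
    return default

trace = [(0.0, "idle"), (1.5, "idle"), (4.0, "done")]
# time_until of (state == "done") is 4.0 at step 0 and 2.5 at step 1.
```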

#### **4 Extending Traces**

**Timed Traces.** The semantics of NUXMV has been extended to take into account timing aspects in the case of super-dense time. While in the discrete-time case an execution trace is a sequence of states connected by discrete transitions (i.e., satisfying the transition condition), in the super-dense time case every pair of consecutive states in a trace is connected by either a discrete or a timed transition. As in the discrete case, discrete transitions are pairs of states satisfying the transition condition. As in timed automata, in a timed transition time elapses by a certain amount (referred to as delta time): clocks increase by the same amount, while discrete variables do not change.

**Lasso-Shaped Traces with Diverging Variables.** Traditionally, the only infinite paths supported by NUXMV have been those in lasso shape, i.e., traces that can be represented by a finite prefix s_0, s_1, ..., s_l (called the stem) followed by a finite suffix s_{l+1}, ..., s_k ≡ s_l (called the loop), which can be repeated infinitely many times. While this representation is sufficient for finite-state systems (in a finite-state setting, if a system does not satisfy an LTL property, then a lasso-shaped counterexample trace is guaranteed to exist), it is an important limitation in an infinite-state context, in which lasso-shaped counterexamples are not guaranteed to exist. (As a simple example, consider a system M := ⟨{x}, (x = 0), (x' = x + 1)⟩ in which x ∈ Z. Then M ⊭ **GF**(x = 0), but clearly M has no lasso-shaped trace.) This is especially relevant for timed transition systems, which, due to the always-diverging variable time, admit *no* lasso-shaped trace.

To overcome this limitation, we introduce a new kind of infinite trace, which we call *lasso-shaped traces with diverging variables*, able to represent traces with variables whose value may diverge. We modified the bounded model checking algorithms to leverage this new representation. This significantly extends the capabilities of NUXMV to find witnesses for violated LTL and MTL properties on timed transition systems (see the experimental evaluation).

**Definition 1.** *Let* π := s_0, s_1, ..., s_l, ... *be an infinite trace of a system* M *over variables* V*. We say that* π *is a* lasso-shaped trace with diverging variables *iff there exist indexes* 0 ≤ l ≤ k*, a partitioning of* V *into sets* X *and* Y *(*V = X ∪ Y*), and an expression* f_y(V) *over* V *for every variable* y ∈ Y *such that, for every* i > k*,*

$$s\_i(v) := \begin{cases} s\_{l + ((i-l)\bmod{(k-l)})}(v) & \text{if } v \in X \text{ (as in lasso-shaped traces);}\\ f\_v(s\_{i-1}) & \text{if } v \in Y \text{ (as a function of the previous state).} \end{cases}$$

Intuitively, lasso-shaped traces with diverging variables provide a finite representation for infinite traces that is more general than simple lasso-shaped ones, and that captures more interesting behaviors of timed transition systems.

*Example 1.* Consider the system M := ⟨{y, b}, ¬b ∧ y = 0, (b' = ¬b) ∧ (b → y' = y + 1) ∧ (¬b → y' = y)⟩. Then one lasso-shaped trace with diverging variables for M is given by π := s_0, s_1, s_2, where s_0 := {b → ⊥, y → 0}, s_1 := {b → ⊤, y → 0}, and s_2 := {b → ⊥, y → 1}, considering Y := {y}, the loop-back at index 0, and f_y(b, y) := b ? y + 1 : y.
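Definition 1 can be exercised directly in code. The sketch below (our own illustrative code, not nuXmv internals) unrolls the trace of Example 1: X = {b} replays the loop with l = 0 and k = 2, while the diverging variable y is computed from the previous state via f_y.

```python
# Unrolling a lasso-shaped trace with diverging variables (Definition 1),
# instantiated on Example 1: stem s0, s1, s2 with loop-back l = 0, k = 2,
# X = {b} replayed cyclically, Y = {y} updated as f_y(previous state).

prefix = [{"b": False, "y": 0}, {"b": True, "y": 0}, {"b": False, "y": 1}]
l, k = 0, 2

def f_y(prev):                  # f_y(b, y) := b ? y + 1 : y
    return prev["y"] + 1 if prev["b"] else prev["y"]

def unroll(n):
    trace = list(prefix)
    for i in range(k + 1, n):
        b = trace[l + ((i - l) % (k - l))]["b"]  # X: replay the loop
        y = f_y(trace[i - 1])                    # Y: function of previous state
        trace.append({"b": b, "y": y})
    return trace

tr = unroll(8)
print([s["y"] for s in tr])  # [0, 0, 1, 1, 2, 2, 3, 3]: y diverges
print([s["b"] for s in tr])  # alternates False, True, ... as in the loop
```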

*Extended BMC for Traces with Divergent Clocks.* The definition above requires the existence of the functions f_y for computing the updates of diverging variables. In case y is a clock variable, we can define a region φ_y in which y can diverge (i.e., f_y = y + δ, where δ is the delta time variable).

In order to capture lasso-shaped traces with diverging variables, we can modify the BMC encoding as follows. Let $\bigvee\_{l=0}^{k} \left( \bigwedge\_{v \in X \cup Y} (v^l = v^k) \land\_l [\![\varphi]\!]\_k^0 \right)$ be the formula representing the BMC encoding of [14] at depth k with all possible loop-backs 0 ≤ l ≤ k for a given formula ϕ. The encoding is extended as follows:

$$\bigvee\_{l=0}^{k} \left( \left( \bigwedge\_{x \in X} (x^l = x^k) \land \bigwedge\_{y \in Y} \left( y^l = y^k \lor \bigwedge\_{i=l}^{k} [\phi\_y]\_i \right) \right) \land\_l [\![\varphi]\!]\_k^0 \right)$$

The correctness of the encoding relies on a safe choice of the set Y , falling back to the incomplete lasso-shaped case when some syntactic restrictions on the expressions containing clocks are not met (see appendix for more details).
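The extended loop-back condition can be mirrored concretely on a finite trace: a candidate loop-back l at depth k is accepted when every variable in X matches exactly (v^l = v^k), and every variable in Y either matches or satisfies its divergence region φ_y at every step in [l, k]. A hypothetical finite-trace check (our names; the real encoding is in SMT, not executed on concrete traces):

```python
# Sketch of the extended loop-back check: accept loop-back l at depth k
# when each x in X matches exactly, and each y in Y either matches or
# satisfies its divergence region phi_y at every step i in [l, k].

def lasso_ok(trace, l, k, X, Y, phi):
    for x in X:
        if trace[l][x] != trace[k][x]:
            return False
    for y in Y:
        if trace[l][y] != trace[k][y] and \
           not all(phi[y](trace[i]) for i in range(l, k + 1)):
            return False
    return True

# t diverges but stays inside its divergence region; b loops exactly:
trace = [{"b": False, "t": 0.0}, {"b": True, "t": 1.0}, {"b": False, "t": 2.5}]
print(lasso_ok(trace, 0, 2, X=["b"], Y=["t"],
               phi={"t": lambda s: s["t"] >= 0}))  # True
```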

#### **5 Related Work**

There are many tools that allow for the specification and verification of infinite-state symbolic synchronous transition systems. Given the focus of this paper, we restrict our attention to tools supporting timed systems and/or MTL properties.

Uppaal [15], the reference tool for timed systems verification, supports only bounded variable types and therefore finite asynchronous TTS. Properties are limited to a subset of the branching-time logic TCTL [16,17]. LTSmin [18] and Divine [19] are two model checkers that support the Uppaal specification language and properties specified in LTL. RTD-Finder [20] handles only safety properties for real-time component-based systems specified in RT-BIP. The verification is based on a compositional computation of an invariant over-approximating the set of reachable states of the system, and leverages a counterexample-based invariant refinement algorithm. The ZOT Bounded Model/Satisfiability Checker [21] supports different logic languages through a multi-layered approach based on LTL with past operators. Similarly to NUXMV, ZOT supports dense-time MTL. However, it relies only on SMT-based Bounded Model Checking, and is therefore unable to prove that properties hold. Atmoc [22] implements an extension of IC3 [7] and K-induction [23] to deal with symbolic timed transition systems. It supports both invariant and MTL0,∞ properties, although for the latter it only supports bounded model checking. CTAV [24] reduces the model checking problem for an MTL0,∞ property ϕ to a symbolic language emptiness check of a timed Büchi automaton for ϕ.

Differently from all the above tools, NUXMV is able to prove MTL0,∞ properties on timed transition systems with infinite-domain variables.

#### **6 Experimental Evaluation**

We compared NUXMV with Atmoc [22], CTAV [24], ZOT [21], Divine [19], LTSmin [18], and Uppaal [25].

For the evaluation we considered (i) scalable benchmarks taken from the distributions of competitor tools and from the literature, and (ii) handcrafted benchmarks that stress various language features. In particular, we considered different versions of the Fischer mutual exclusion protocol (correct and buggy) with different properties, and different versions of the emergency diesel generator problem (previously studied with Atmoc [22]). Finally, we also considered the validity checks of some MTL properties, also taken from [22].

**Fig. 3.** Runtime for the Fischer mutual exclusion problem; x-axis: number of processes, y-axis: time (s). The LTL-1 and MTL-1 properties are the bounded versions of LTL-0 and MTL-0, respectively.

We ran all the experiments on a PC equipped with a 3.7 GHz Xeon quad-core CPU and 16 GB of RAM, using a time/memory limit of 1000 s/10 GB for each test. We refer the reader to [26] for all the data needed to reproduce this experimental evaluation.

The results of the evaluation are reported in Fig. 3 for the Fischer family of experiments and in Fig. 4 for the emergency diesel generator family of problems (CTAV does not appear in the plot of MTL-0 because it wrongly reports a counterexample, although MTL-0 is the bounded version of LTL-0), while the results for the validity check of pure MTL properties are reported in Fig. 5. In the plots, NUXMV refers to the runtime of IC3 with implicit abstraction run in lockstep with BMC using the modified loop condition, and NUXMV-bmc refers to the runtime of BMC alone with the modified loop condition. The results show that NUXMV is competitive with, and in many cases performs better than, other state-of-the-art tools, especially on validity problems for MTL0,∞.

**Fig. 4.** Result for the runtime (s) for the emergency diesel generator family of problems: NUXMV (x) vs Atmoc (y).

**Fig. 5.** Runtime (s) for the validity checks of MTL properties.

#### **7 Conclusions**

We presented the new version of NUXMV, a state-of-the-art symbolic model checker for finite- and infinite-state transition systems, extended to allow for the specification of synchronous timed transition systems and of MTL0,∞ properties. To support the new features, we extended the NUXMV language, allowed for the specification of MTL0,∞ formulas with parametric intervals, and adapted the model checking algorithms to search for lasso-shaped traces (over a discrete semantics) in which clocks may diverge. We evaluated the new features by comparing NUXMV with other verification tools for timed automata on different benchmarks. The results show that NUXMV is competitive with, and in many cases performs better than, state-of-the-art tools, especially on validity problems for MTL0,∞.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### Cerberus-BMC: A Principled Reference Semantics and Exploration Tool for Concurrent and Sequential C

Stella Lau1,2(B) , Victor B. F. Gomes<sup>2</sup>, Kayvan Memarian<sup>2</sup>, Jean Pichon-Pharabod<sup>2</sup>, and Peter Sewell<sup>2</sup>

<sup>1</sup> MIT, Cambridge, USA stellal@mit.edu <sup>2</sup> University of Cambridge, Cambridge, UK {victor.gomes,kayvan.memarian, jean.pichon-pharabod,peter.sewell}@cl.cam.ac.uk

Abstract. C remains central to our infrastructure, making verification of C code an essential and much-researched topic, but the semantics of C is remarkably complex, and important aspects of it are still unsettled, leaving programmers and verification tool builders on shaky ground. This paper describes a tool, Cerberus-BMC, that for the first time provides a principled reference semantics that simultaneously supports (1) a choice of concurrency memory model (including substantial fragments of the C11, RC11, and Linux kernel memory models), (2) a modern memory object model, and (3) a well-validated thread-local semantics for a large fragment of the language. The tool should be useful for C programmers, compiler writers, verification tool builders, and members of the C/C++ standards committees.

#### 1 Introduction

C remains central to our infrastructure, widely used for security-critical components of hypervisors, operating systems, language runtimes, and embedded systems. This has prompted much research on the verification of C code, but the semantics of C is remarkably complex, and important aspects of it are still unsettled, leaving programmers and verification tool builders on shaky ground. Here we are concerned with three aspects:

*1. The Concurrency Memory Model.* The 2011 versions of the ISO C++ and C standards adopted a new concurrency model [3,12,13], formalised during the development process [11], but the model is still in flux: various fixes have been found to be necessary [9,14,26]; the model still suffers from the "thin-air problem" [10,15,35]; and Linux kernel C code uses a different model, itself recently partially formalised [7].

*2. The Memory Object Model.* A priori, one might imagine C follows one of two language-design extremes: a concrete byte-array model with pointers that are simply machine words, or an abstract model with pointers combining abstract block IDs and structured offsets. In fact C is neither of these: it permits casts between pointer and integer types, and manipulation of their byte representations, to support low-level systems programming, but, while at runtime a C pointer will typically just be a machine word, compiler analyses and optimisations reason about abstract notions of the provenance of pointers [27,29,31]. This is a subject of active discussion in the ISO C and C++ committees and in compiler development communities.

*3. The Thread-Local Sequential Semantics.* Here, there are many aspects, e.g. the loosely specified evaluation order, the semantics of integer promotions, many kinds of undefined behaviour, and so on, that are (given an expert reading) reasonably well-defined in the standard, but that are nonetheless very complex and widely misunderstood. The standard, being just a prose document, is not *executable as a test oracle*; it is not a reference semantics usable for exploration or automated testing.

Each of these is challenging in isolation, but there are also many subtle interactions between them. For example, between (1) and (3), the pre-C11 ISO standard text was in terms of sequential stepwise execution of an (informally specified) abstract machine, while the C11 concurrency model is expressed as a predicate over complete candidate executions, and the two have never been fully reconciled – e.g. in the standard's treatment of object lifetimes. Then there are fundamental issues in combining the ISO treatment of undefined behaviour with that axiomatic-concurrency-model style [10, §7]. Between (1) and (2), one has to ask about the relationships between the definition of data race and the treatment of uninitialised memory and padding. Between (2) and (3), there are many choices for what the C memory object model should be, and how it should be integrated with the standard, which are currently under debate. Between all three one has to consider the relationships between uninitialised and thin-air values and the ISO notions of unspecified values and trap representations. These are all open questions in what the C semantics and ISO standard are (or should be). We do not solve them here, but we provide a necessary starting point: a tool embodying a precise reference semantics that lets one explore examples and debate the alternatives.

We describe a tool, Cerberus-BMC, that for the first time lets one explore the allowed behaviours of C test programs that involve all three of the above. It is available via a web interface at http://cerberus.cl.cam.ac.uk/bmc.html.

For (1), Cerberus-BMC is parameterised on an axiomatic memory concurrency model: it reads in a definition of the model in a Herd-like format [6], and so can be instantiated with (substantial fragments of) either the C11 [3,9,12–14], RC11 [26], or Linux kernel [7] memory models. The model can be edited in the web interface. Then the user can load (or edit in the web interface) a small C program. The tool first applies the Cerberus compositional translation (or elaboration) into a simple Core language, as in [29,31]; this elaboration addresses (3) by making many of the thread-local subtleties of C explicit, including the loose specification of evaluation order, arithmetic conversions, implementation-defined behaviour, and many kinds of undefined behaviour. Core computation is simply over mathematical integers, with explicit memory actions to interface with the concurrency and memory object models. However, there is a mismatch between the axiomatic style of the concurrency models for C (expressed as predicates on arbitrary candidate executions) with the operational style of the previous thread-local operational semantics for Core. We address this by replacing the latter with a new translation from Core into SMT problems. This is integrated with the concurrency model, also translated into SMT, following the ideas of [5]. These are furthermore integrated with an SMT version of parts of the PNVI (provenance-not-via-integers) memory object model of [29], the basis for ongoing work within the ISO WG14 C standards committee, addressing (2). The resulting SMT problems are passed to Z3 [32]. The web interface then provides a graphical view of the allowed concurrent executions for small test programs.

The Cerberus-BMC tool should be useful for programmers, compiler writers, verification tool builders, and members of the C/C++ standards committees. We emphasise that it is intended as an executable reference semantics for small test programs, not itself as a verification tool that can be applied to larger bodies of C: we have focussed on making it transparently based on principled semantics for all three aspects, without the complexities needed for a high-performance verification tool. But it should aid the construction of such.

*Caveats and Limitations.* Cerberus-BMC covers many features of 1–3, but far from all. With respect to the concurrency memory model, we support substantial fragments of the C11, RC11, and Linux kernel memory models. We omit locks and the (deprecated) C11/RC11 consume accesses. We only cover compare-exchange read-modify-write operations, and the fragment of RCU restricted to rcu\_read\_lock(), rcu\_read\_unlock(), and synchronize\_rcu() used in a linear way, without control-flow-dependent calls to RCU, and without nesting.

With respect to the memory object model, we do not currently support dynamic allocation or manipulation of byte representations (such as with char\* pointers), and we do not address issues such as subobject provenance (an open question within WG14).

With respect to the thread semantics, our translation to SMT does not currently cover arbitrary pointer type-casting, function pointers, multi-dimensional arrays, unions, floating point, bitwise operations, and variadic functions, and only covers simple structs. In addition, we inherit the limitations of the Cerberus thread semantics as per [29].

*Related Work.* There is substantial prior work on tools for concurrency semantics and for C semantics, but almost none that combines the two. On the concurrency semantics side, CppMem [1,11] is a web-interface tool that computes the allowed concurrent behaviours of small tests with respect to variants (now somewhat outdated) of the C11 model, but it does not support other concurrency models or a memory object model, and it supports only a small fragment of C. Herd [6,8] is a command-line tool that computes the allowed concurrent behaviours of small tests with respect to arbitrary axiomatic concurrency models expressed in its cat language, but without a memory object model and for tests which essentially just comprise memory events, without a C semantics. MemAlloy [38] and MemSynth [16] also support reasoning about axiomatic concurrency models, but again not integrated with a C language semantics.

On the C semantics side, several projects address sequential C semantics but without concurrency. We build here on Cerberus [28,29,31], a web-interface tool that computes the allowed behaviours (interactively or exhaustively) for moderate-sized tests in a substantial fragment of sequential C, incorporating various memory object models (an early version supported Nienhuis's operational model for C11 concurrency [33], but that is no longer integrated). KCC and RV-Match [19,21,22] provide a command-line semantics tool for a substantial fragment of C, again without concurrency. Krebbers gives a Coq semantics for a somewhat smaller fragment [24].

Then there is another large body of work on model-checking tools for sequential and concurrent C. These are all optimised for model-checking performance, in contrast to the Cerberus-BMC emphasis on expressing the semantic envelope of allowed behaviour as clearly as we can (and, where possible, closely linked to the ISO standard). The former include tis-interpreter [18,36], CBMC [17,25], and ESBMC [20]. On the concurrent side, as already mentioned, we build on the approach of [5], which integrated various hardware memory concurrency models with CBMC. CDSChecker [34] supports something like the C/C++11 concurrency model, but subject to various limitations [34, §1.3]. It is implemented using a dynamically linked shared library for the C and C++ atomic types, so it implicitly adopts the C semantic choices of whichever compiler is used. RCMC [23] supports memory models that do not exhibit Load Buffering (LB), for an idealised thread-local language. Nidhugg [4] supports only hardware memory models: SC, TSO, PSO, and versions of POWER and ARM.

#### 2 Examples

We now illustrate some of what Cerberus-BMC can do, by example.

*Concurrency Models.* First, for C11 concurrency, Fig. 1 shows a screenshot for a classic message-passing test, with non-atomic writes and reads of x, synchronised with release/acquire writes and reads of y. The test uses an explicit parallel composition construct to avoid the noise from the extra memory actions in pthread\_create. The consistent, race-free, UB-free execution on the right shows the synchronisation working correctly: after the i read-acquire of y=1, the l non-atomic read of x has to read x=1 (there are no consistent executions in which it does not). As usual in C/C++ candidate execution graphs, rf are reads-from edges, sb is sequenced-before (program order), mo is modification

Fig. 1. Cerberus-BMC Screenshot: C11 Release/Acquire Message Passing. If the read of y is 1, then the last thread has to see the write of 1 to x.

Fig. 2. Linux kernel memory model RCU lock. Without synchronize\_rcu(), the reads of x and y can see 0 and 1 (as shown), even though they are enclosed in an RCU lock. With synchronization, after reading x=1, the last thread has to see y=1.

order (the coherence order between atomic writes to the same address), and asw is additional-synchronised-with, between parent and child threads and vice versa. Read and write events (R/W) are annotated na for non-atomic and rel/acq for release/acquire.

For the Linux kernel memory model, the example in Fig. 2 shows an RCU (read-copy-update) synchronisation.

*Memory Object Model.* The example below illustrates a case where one cannot assume that C has a concrete memory object model: pointer provenance matters.

In some C implementations, x and y will happen to be allocated adjacently (the \_\_BMC\_ASSUME restricts attention to those executions). Then &x+1 will have the same numeric address as &y, but the write \*p=11 is undefined behaviour rather than a write to y. This was informally described in the 2004 ISO WG14 C standards committee response to

```
#include <stdint.h>
int x = 1, y = 2;
int main() {
  int *p = &x + 1;
  int *q = &y;
  __BMC_ASSUME((intptr_t)p==(intptr_t)q);
  if ((intptr_t)p==(intptr_t)q)
    *p = 11; // does this have UB?
 }
```
Defect Report 260 [37], but has never been incorporated into the standard itself. Cerberus-BMC correctly reports UB found: source.c:8:5-7, UB043\_indirection\_invalid\_value, following the PNVI (provenance-not-via-integers) memory object model of [29].

*ISO Subtleties.* Turning to areas where the ISO standard is clear to experts but widely misunderstood, in the example on the right ISO leaves it implementation-defined whether **char** is signed or unsigned. In the former case, the ISO integer promotion and conversion semantics will make the equality test false, leading to a division by 0, which is undefined behaviour.


The example below shows the correct treatment of the ISO standard's loose specification of evaluation order, together with detection of the concurrency model's *unsequenced races* (ur in the diagram): there are write and read accesses to x that are unrelated by sequenced-before (sb) and not otherwise synchronised, hence unrelated by happens-before, which makes this program undefined behaviour.

*Treiber Stack.* Finally, demonstrating the combination of all three aspects, we implemented a modified Treiber stack (the push() function is shown in Fig. 3) with relaxed accesses to struct fields. Although the Treiber stack is traditionally implemented by spinning on a compare-and-swap, as that can spin unboundedly, we instead use \_\_BMC\_ASSUME to restrict executions to those where the compare-and-swap succeeds. Our tool correctly detects the different results of relaxed-memory executions of threads concurrently running the push and pop functions.

Fig. 3. Treiber stack push()

Fig. 4. Core program corresponding to **int** main(){**int** x = 1}. Core is essentially a typed, first-order lambda calculus with explicit memory actions such as create and store to interface with the concurrency and memory object models.

#### 3 Implementation

After translating a C program into Core (see Fig. 4), Cerberus-BMC does a sequence of Core-to-Core rewrites in the style of bounded model checkers such as CBMC: it unwinds loops and inlines function calls (to a given bound), and renames symbols to generate an SSA-style program.

The explicit representation of memory operations in Core as first-order constructs allows the SMT translation to be easily separated into three components: the translation from Core to SMT, the memory object model constraints, and the concurrency model constraints.

*1. Core to SMT.* Each value in Core is represented as an SMT expression, with fresh SMT constants for memory actions such as create and store (e.g. lines 2 and 4), the concrete values of which are constrained by the memory object and concurrency models. The elaboration of C to Core makes thread-local undefined behaviour (as opposed to undefined behaviour from concurrency or memory layout), like signed integer overflow, explicit with a primitive undef construct. Undefined behaviour is then encoded in SMT as reachability of undef expressions, that is, satisfiability of the control-flow guards up to them.
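The "UB as reachability of undef" idea can be pictured with a toy check (our names; the tool uses SMT rather than enumeration): collect the control-flow guard of each undef occurrence and ask whether some input can satisfy it.

```python
# Toy version of "undefined behaviour as reachability of undef": an undef
# occurrence is reachable iff its control-flow guard is satisfiable, here
# decided by brute-force enumeration instead of an SMT solver.

from itertools import product

def ub_reachable(guard, nvars):
    """guard: a predicate over a tuple of booleans modelling path conditions."""
    return any(guard(bits) for bits in product([False, True], repeat=nvars))

# undef guarded by (a and not b): reachable, e.g. for a=True, b=False
print(ub_reachable(lambda bits: bits[0] and not bits[1], 2))  # True
# undef guarded by (a and not a): dead code, the UB is never triggered
print(ub_reachable(lambda bits: bits[0] and not bits[0], 2))  # False
```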

*2. Memory Object Model.* As in the PNVI semantics [30], Cerberus-BMC represents pointers as pairs (π, a) of a provenance π and an integer address a. The provenance of a pointer is taken into account when doing memory accesses, pointer comparisons, and casts between integer and pointer values. Our tool models address allocation nondeterminism by constraining address values based on allocations to be appropriately aligned and non-overlapping, but not constraining the addresses otherwise.
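The provenance check can be pictured with a toy model (our names and encoding, not the Cerberus internals): a pointer is a pair (provenance, address), and an access is defined only if the address lies within the allocation that the provenance identifies. This is exactly what makes the Defect Report 260 example above UB.

```python
# Toy PNVI-style pointer model: a pointer carries (provenance, address);
# a load/store is defined only if the address lies within the allocation
# named by the provenance.  Purely illustrative, not the Cerberus encoding.

allocations = {1: (1000, 4), 2: (1004, 4)}   # prov -> (base, size): x and y

def access_ok(ptr):
    prov, addr = ptr
    base, size = allocations[prov]
    return base <= addr < base + size

p = (1, 1004)   # one past the end of x: same numeric address as y...
q = (2, 1004)   # ...but with y's provenance
print(p[1] == q[1])   # True: the addresses compare equal
print(access_ok(p))   # False: writing through p is UB
print(access_ok(q))   # True: writing through q is fine
```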

*3. Concurrency Model.* Cerberus-BMC statically extracts memory actions and computes an extended pre-execution containing relations such as program order. As control flow cannot be statically determined, memory actions are associated with an SMT Boolean guard representing the control-flow conditions under which the memory action is executed.

Cerberus-BMC reads in a model definition in a subset of the herd cat language large enough to express C11, RC11, and Linux, and generates a set of quantifier-free SMT expressions corresponding to the model's constraints on relations. These constraints are based on a set of "built-in" relations defined in SMT such as rf. Cerberus-BMC then queries Z3 to extract all the executions, displaying the load/store values and computed relations for the user.

#### 4 Validation

We validate correctness of the three aspects of Cerberus-BMC as follows, though, as ever, additional testing would be desirable. Performance data, demonstrating practical usability, is from a MacBook Pro 2.9 GHz Intel Core i5.

For C11 and RC11 concurrency, we check on 12 classic litmus tests. For Linux kernel concurrency, we hand-translated the 9 non-RCU tests and 4 of the RCU tests of [7] into C, and automatically translated the 40 tests of [2]. Running all the non-RCU tests takes less than 5 min; the RCU tests are slower, of the order of one hour, perhaps because of the recursive definitions involved.

For the memory object model, we take the supported subset (36 tests) of the provenance semantics test suite of [29]. These single-threaded tests each run in less than a second.

For the thread-local semantics, the Cerberus pipeline to Core has previously been validated using GCC Torture, Toyota ITC, KCC, and Csmith-generated test suites [29]. We check the mapping to BMC using 50 hand-written tests and the supported subset (400 tests) of the Toyota ITC test suite, each running in less than two minutes.

These test suites and the examples in the paper can be accessed via the CAV 2019 pop-up in the File menu of the tool.

Acknowledgments. This work was partially supported by EPSRC grant EP/K008528/1 (REMS), ERC Advanced Grant ELVER 789108, and an MIT EECS Graduate Alumni Fellowship.

### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Cyber-Physical Systems and Machine Learning

### **Multi-armed Bandits for Boolean Connectives in Hybrid System Falsification**

Zhenya Zhang1,2(B) , Ichiro Hasuo1,2 , and Paolo Arcaini<sup>1</sup>

<sup>1</sup> National Institute of Informatics, Tokyo, Japan {zhangzy,hasuo,arcaini}@nii.ac.jp <sup>2</sup> SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan

**Abstract.** *Hybrid system falsification* is an actively studied topic, as a scalable quality assurance methodology for real-world cyber-physical systems. In falsification, one employs stochastic hill-climbing optimization to quickly find a counterexample input to a black-box system model. Quantitative *robust semantics* is the technical key that enables the use of such optimization. In this paper, we tackle the so-called *scale problem* regarding Boolean connectives that is widely recognized in the community: quantities of different scales (such as speed [km/h] vs. rpm, or worse, rph) can mask each other's contribution to robustness. Our solution consists of the integration of *multi-armed bandit* algorithms into hill-climbing-guided falsification frameworks, with the technical novelty of a new reward notion that we call *hill-climbing gain*. Our experiments show our approach's robustness under the change of scales, and that it outperforms a state-of-the-art falsification tool.

#### **1 Introduction**

*Hybrid System Falsification.* Quality assurance of *cyber-physical systems (CPS)* is attracting growing attention from both academia and industry, not only because it is challenging and scientifically interesting, but also due to the safety-critical nature of many CPS. The combination of physical systems (with continuous dynamics) and digital controllers (that are inherently discrete) is referred to as *hybrid systems*, capturing an important aspect of CPS. To verify hybrid systems is intrinsically hard, because the continuous dynamics therein leads to infinite search spaces.

More researchers and practitioners are therefore turning to *optimization-based falsification* as a quality assurance measure for CPS. The problem is formalized as follows.

The authors are supported by ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), JST.

#### **The falsification problem**

– **Given:** a *model* M (that takes an input signal **u** and yields an output signal M(**u**)), and a *specification* ϕ (a temporal formula). Schematically: **u** → M → M(**u**) |= ϕ?

– **Find:** a *falsifying input*, that is, an input signal **u** such that the corresponding output M(**u**) violates ϕ

In optimization-based falsification, the above problem is turned into an optimization problem. It is the *robust semantics* of temporal formulas [12,17] that makes this possible. Instead of the Boolean satisfaction relation **v** |= ϕ, robust semantics assigns a quantity ⟦**v**, ϕ⟧ ∈ ℝ ∪ {∞, −∞} that tells us not only whether ϕ is true or false (by the sign), but also *how robustly* the formula is true or false. This allows one to employ hill-climbing optimization: we iteratively generate input signals, in the direction of decreasing robustness, hoping that we eventually hit negative robustness.

**Table 1.** Boolean satisfaction **w** |= ϕ, and quantitative robustness values ⟦**w**, ϕ⟧, of three signals of *speed* for the STL formula ϕ ≡ □<sub>[0,30]</sub>(*speed* < 120)

An illustration of robust semantics is in Table 1. We use *signal temporal logic (STL)* [12], a temporal logic commonly used in hybrid system specification. The specification says that the speed must always be below 120 during the time interval [0, 30]. In the search for an input signal **u** (e.g. of throttle and brake) whose corresponding output M(**u**) violates the specification, the quantitative robustness ⟦M(**u**), ϕ⟧ gives much more information than the Boolean satisfaction M(**u**) |= ϕ. Indeed, in Table 1, while Boolean satisfaction fails to discriminate the first two signals, the quantitative robustness indicates that the second signal is closer to violating the specification.
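The robustness computation behind Table 1 is simple to sketch. The following is an illustrative Python sketch (the paper's implementation is in MATLAB on top of Breach), using discretely sampled signals and made-up values:

```python
# Sketch: robustness of the STL formula "always_[0,30] (speed < 120)" on a
# discretely sampled speed signal. For "always", the robustness is the
# infimum over time of the margin 120 - speed(t).

def robustness_always_speed_below(speed_samples, threshold=120.0):
    """Robustness of always(speed < threshold): min over samples of threshold - speed."""
    return min(threshold - s for s in speed_samples)

# Three illustrative signals in the spirit of Table 1 (values are made up):
safe_far   = [80, 90, 100]     # satisfies, robustly:  min margin = 20
safe_close = [110, 119, 115]   # satisfies, barely:    min margin = 1
violating  = [110, 125, 100]   # violates:             min margin = -5

assert robustness_always_speed_below(safe_far) == 20.0
assert robustness_always_speed_below(safe_close) == 1.0
assert robustness_always_speed_below(violating) == -5.0
```

The sign recovers the Boolean verdict, while the magnitude tells hill climbing which of the two satisfying signals is closer to a violation.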

In the falsification literature, stochastic algorithms are used for hill-climbing optimization. Examples include simulated annealing (SA), globalized Nelder-Mead (GNM [30]) and the covariance matrix adaptation evolution strategy (CMA-ES [6]). Note that the system model M can be black-box: we only have to observe the correspondence between input **u** and output M(**u**). Observing an error M(**u**′) for some input **u**′ is sufficient evidence for a system designer to know that the system needs improvement. Besides these practical advantages, optimization-based falsification is an interesting scientific topic: it combines the two different worlds of formal reasoning and stochastic optimization.

Optimization-based falsification started in [17] and has been developed vigorously [1,3–5,9,11–13,15,27,28,34,36,38]. See [26] for a survey. There are mature tools such as Breach [11] and S-TaLiRo [5]; they work with industry-standard Simulink models.

*Challenge: The Scale Problem in Boolean Superposition.* In the field of hybrid falsification—and more generally in search-based testing—the following problem is widely recognized. We shall call the problem *the scale problem (in Boolean superposition)*.

Consider an STL specification ϕ ≡ □<sub>[0,30]</sub>(¬(*rpm* > 4000) ∨ (*speed* > 20)) for a car; it is equivalent to □<sub>[0,30]</sub>((*rpm* > 4000) → (*speed* > 20)) and says that the speed should not be too small whenever the rpm is over 4000. According to the usual definition in the literature [11,17], the Boolean connectives ¬ and ∨ are interpreted by negation − and the supremum ⊔, respectively, and the "always" operator □<sub>[0,30]</sub> by the infimum ⊓. Therefore the robust semantics of ϕ under the signal (*rpm*, *speed*), where *rpm*, *speed*: [0, 30] → ℝ, is given as follows.

$$\left[\!\left[(\mathit{rpm}, \mathit{speed}),\, \varphi\right]\!\right] \;=\; \bigsqcap_{t \in [0,30]} \Big( \big(4000 - \mathit{rpm}(t)\big) \sqcup \big(\mathit{speed}(t) - 20\big) \Big) \tag{1}$$

A problem is that, in the supremum of two real values in (1), one component can totally *mask* the contribution of the other. In this specific example, the former (*rpm*) component can have values as big as thousands, while the latter (*speed*) component will be in the order of tens. This means that in hill-climbing optimization it is hard to use the information of both signals, as one will be masked.

Another related problem is that the efficiency of a falsification algorithm would depend on the choice of units of measure. Imagine replacing rpm with rph in (1): this turns the constant 4000 into 240000, and makes the situation even worse.
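To make the masking concrete, here is a hedged numeric sketch of Eq. (1) on sampled signals; all values are hypothetical:

```python
# Sketch illustrating the scale problem in Eq. (1): the rpm term (order of
# thousands) masks the speed term (order of tens) under the supremum.

def robustness_eq1(rpm, speed):
    """inf_t ( (4000 - rpm(t)) ⊔ (speed(t) - 20) ) over sampled signals."""
    return min(max(4000 - r, s - 20) for r, s in zip(rpm, speed))

rpm   = [3000, 3500, 3200]   # rpm stays well below 4000
speed = [5, 6, 4]            # speed is dangerously low...

# ...yet the robustness is dominated entirely by the rpm margin:
assert robustness_eq1(rpm, speed) == 500   # = 4000 - 3500; speed is masked

# Changing units (rpm -> rph) makes it worse: the masking term grows 60-fold.
rob_rph = min(max(240000 - r * 60, s - 20) for r, s in zip(rpm, speed))
assert rob_rph == 30000
```

The speed margins (−15, −14, −16) carry the information a hill climber would need, but they never surface in the aggregate value.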

These problems—which we call the *scale problem*—occur in many falsification examples, specifically when a specification involves Boolean connectives. We do need Boolean connectives in specifications: for example, many real-world specifications in industry are of the form □<sub>I</sub>(ϕ<sub>1</sub> → ϕ<sub>2</sub>), requiring that an event ϕ<sub>1</sub> triggers a countermeasure ϕ<sub>2</sub> all the time.

One could use different operators for interpreting Boolean connectives. For example, in [21], <sup>∨</sup> and <sup>∧</sup> are interpreted by + and <sup>×</sup> over <sup>R</sup>, respectively. However, these choices do not resolve the scale problem, either. In general, it does not seem easy to come up with a fixed set of operators over R that interpret Boolean connectives and are free from the scale problem.

*Contribution: Integrating Multi-Armed Bandits into Optimization-Based Falsification.* As a solution to the scale problem in Boolean superposition that we just described, we introduce a new approach that does *not* superpose robustness values. Instead, we integrate *multi-armed bandit (MAB)* algorithms into the existing framework of falsification guided by hill-climbing optimization.

**Fig. 1.** A multi-armed bandit for falsifying □<sub>I</sub>(ϕ<sub>1</sub> ∧ ϕ<sub>2</sub>)

The MAB problem is a prototypical reinforcement learning problem: a gambler sits in front of a row of slot machines, whose performance (i.e. average reward) is not known; the gambler plays a machine in each round, over many rounds; and the goal is to maximize the cumulative reward. The gambler needs to play different machines to figure out their performance, at the cost of lost opportunities in the form of playing suboptimal machines.

In this paper, we focus on specifications of the form □<sub>I</sub>(ϕ<sub>1</sub> ∧ ϕ<sub>2</sub>) and □<sub>I</sub>(ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>); we call them *(conjunctive/disjunctive) safety properties*. We identify an instance of the MAB problem in the choice of the formula (out of ϕ<sub>1</sub>, ϕ<sub>2</sub>) to try to falsify by hill climbing. See Fig. 1. We combine MAB algorithms (such as ε-greedy and UCB1, see Sect. 3.2) with hill-climbing optimization, for the purpose of coping with the scale problem in Boolean superposition. This combination is made possible by introducing a novel reward notion for MAB, called *hill-climbing gain*, that is tailored for this purpose.

We have implemented our MAB-based falsification framework in MATLAB, building on Breach [11].<sup>1</sup> Our experiments with benchmarks from [7,24,25] demonstrate that our MAB-based approach is a viable one against the scale problem. In particular, our approach is observed to be (almost totally) robust under changes of scaling (i.e. changing units of measure, such as from rpm to rph, as discussed after formula (1)). Moreover, on benchmarks taken from previous works—which do not suffer much from the scale problem—our algorithm performs better than the state-of-the-art falsification tool Breach [11].

*Related Work.* Besides those we mentioned, we shall discuss some related works.

Formal verification approaches to correctness of hybrid systems employ a wide range of techniques, including model checking, theorem proving, rigorous numerics, nonstandard analysis, and so on [8,14,18,20,22,23,29,32]. These are currently not very successful in dealing with complex real-world systems, due to issues like scalability and black-box components.

Our use of MAB in falsification exemplifies the role of the *exploration-exploitation trade-off*, the core problem in reinforcement learning. The trade-off has already been discussed for the verification of quantitative properties (e.g., [33]) and also in some works on falsification. A recent example is [36], where Monte Carlo tree search is used to force systematic exploration of the space of input signals. Besides MCTS, *Gaussian process learning (GP learning)* has also attracted attention in machine learning as a clean way of balancing exploitation and exploration. The GP-UCB algorithm is a widely used strategy there. Its use in hybrid system falsification is pursued e.g. in [3,34].

More generally, *coverage-guided falsification* [1,9,13,28] aims at coping with the exploration-exploitation trade-off. One can set the current work in this context—the difference is that we force systematic exploration on the specification side, not in the input space.

There have been efforts to enhance the expressiveness of MTL and STL, so that engineers can express richer intentions—such as time robustness and frequency—in specifications [2,31]. This research direction is orthogonal to ours; we plan to investigate the use of such logics in our current framework.

<sup>1</sup> Code obtained at https://github.com/decyphir/breach.

A similar masking problem around Boolean connectives is discussed in [10,19]. Compared to those approaches, our technique does not need the explicit declaration of *input vacuity* and *output robustness*, but it relies on the "hill-climbing gain" reward to learn the significance of each signal.

Finally, the interest in the use of deep neural networks is rising in the field of falsification (as well as in many other fields). See e.g. [4,27].

#### **2 Preliminaries: Hill Climbing-Guided Falsification**

We review a well-adopted methodology for hybrid system falsification, namely the one guided by hill-climbing optimization. It makes essential use of quantitative *robust semantics* of temporal formulas, which we review too.

#### **2.1 Robust Semantics for STL**

Our definitions here are taken from [12,17].

**Definition 1 ((time-bounded) signal).** Let T ∈ ℝ<sub>+</sub> be a positive real. An *M-dimensional signal* with time horizon T is a function **w**: [0, T] → ℝ<sup>M</sup>.

Let **w**: [0, T] → ℝ<sup>M</sup> and **w**′: [0, T′] → ℝ<sup>M</sup> be M-dimensional signals. Their *concatenation* **w** · **w**′: [0, T + T′] → ℝ<sup>M</sup> is the M-dimensional signal defined by (**w** · **w**′)(t) = **w**(t) if t ∈ [0, T], and (**w** · **w**′)(t) = **w**′(t − T) if t ∈ (T, T + T′].

Let 0 < T<sub>1</sub> < T<sub>2</sub> ≤ T. The *restriction* **w**|<sub>[T1,T2]</sub>: [0, T<sub>2</sub> − T<sub>1</sub>] → ℝ<sup>M</sup> of **w**: [0, T] → ℝ<sup>M</sup> to the interval [T<sub>1</sub>, T<sub>2</sub>] is defined by (**w**|<sub>[T1,T2]</sub>)(t) = **w**(T<sub>1</sub> + t).
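These two operations can be sketched on sampled signals. The `(timestamps, values)` representation below is our own assumption for illustration, not from the paper:

```python
# Sketch of Definition 1's operations on sampled signals, represented as a
# pair (timestamps, values).

def concat(w, w2, T):
    """Concatenation w · w2: append w2 with its timestamps shifted by T, the
    horizon of w. (If both carry a sample at the seam, Def. 1 resolves the
    tie in favor of w, since w2 only covers the half-open interval (T, T+T'].)"""
    ts, vs = w
    ts2, vs2 = w2
    return ts + [t + T for t in ts2], vs + vs2

def restrict(w, T1, T2):
    """Restriction w|_[T1,T2], re-based to start at time 0."""
    ts, vs = w
    kept = [(t - T1, v) for t, v in zip(ts, vs) if T1 <= t <= T2]
    return [t for t, _ in kept], [v for _, v in kept]

w  = ([0, 1, 2], [10, 11, 12])   # horizon T = 2
w2 = ([0, 1], [20, 21])          # horizon T' = 1

ts, vs = concat(w, w2, T=2)
assert ts == [0, 1, 2, 2, 3] and vs == [10, 11, 12, 20, 21]

ts, vs = restrict(w, 1, 2)
assert ts == [0, 1] and vs == [11, 12]
```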

One main advantage of optimization-based falsification is that a system model can be a black box—observing the correspondence between input and output suffices. We therefore define a system model simply as a function.

**Definition 2 (system model M).** A *system model*, with M-dimensional input and N-dimensional output, is a function M that takes an input signal **u**: [0, T] → ℝ<sup>M</sup> and returns a signal M(**u**): [0, T] → ℝ<sup>N</sup>. Here the common time horizon T ∈ ℝ<sub>+</sub> is arbitrary. Furthermore, we impose the following *causality* condition on M: for any time-bounded signals **u**: [0, T] → ℝ<sup>M</sup> and **u**′: [0, T′] → ℝ<sup>M</sup>, we require that M(**u** · **u**′)|<sub>[0,T]</sub> = M(**u**).

**Definition 3 (STL syntax).** We fix a set **Var** of variables. In STL, *atomic propositions* and *formulas* are defined as follows, respectively: α ::≡ f(x<sub>1</sub>, ..., x<sub>N</sub>) > 0, and ϕ ::≡ α | ⊥ | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ U<sub>I</sub> ϕ. Here f is an N-ary function f: ℝ<sup>N</sup> → ℝ, x<sub>1</sub>, ..., x<sub>N</sub> ∈ **Var**, and I is a closed non-singular interval in ℝ<sub>≥0</sub>, i.e. I = [a, b] or [a, ∞) where a, b ∈ ℝ and a < b.

We omit the subscript I of temporal operators if I = [0, ∞). Other common connectives, such as →, ⊤, □<sub>I</sub> (always) and ♦<sub>I</sub> (eventually), are introduced as abbreviations: ♦<sub>I</sub>ϕ ≡ ⊤ U<sub>I</sub> ϕ and □<sub>I</sub>ϕ ≡ ¬♦<sub>I</sub>¬ϕ. An atomic formula f(*x*) ≤ c, where c ∈ ℝ, is accommodated using ¬ and the function f′(*x*) := f(*x*) − c.

**Definition 4 (robust semantics [12]).** Let **w**: [0, T] → ℝ<sup>N</sup> be an N-dimensional signal, and t ∈ [0, T). The *t-shift* of **w**, denoted by **w**<sup>t</sup>, is the time-bounded signal **w**<sup>t</sup>: [0, T − t] → ℝ<sup>N</sup> defined by **w**<sup>t</sup>(t′) := **w**(t + t′).

Let **w**: [0, T] → ℝ<sup>|**Var**|</sup> be a signal, and ϕ be an STL formula. We define the *robustness* ⟦**w**, ϕ⟧ ∈ ℝ ∪ {∞, −∞} as follows, by induction on the construction of formulas. Here $\bigsqcap$ and $\bigsqcup$ denote infima and suprema of real numbers, respectively; their binary versions $\sqcap$ and $\sqcup$ denote minimum and maximum.

$$\begin{aligned} [\![\mathbf{w}, f(x\_1, \dots, x\_N) > 0]\!] &:= f\big(\mathbf{w}(0)(x\_1), \dots, \mathbf{w}(0)(x\_N)\big) \\ [\![\mathbf{w}, \bot]\!] &:= -\infty \qquad [\![\mathbf{w}, \neg\varphi]\!] := -[\![\mathbf{w}, \varphi]\!] \\ [\![\mathbf{w}, \varphi\_1 \land \varphi\_2]\!] &:= [\![\mathbf{w}, \varphi\_1]\!] \sqcap [\![\mathbf{w}, \varphi\_2]\!] \qquad [\![\mathbf{w}, \varphi\_1 \lor \varphi\_2]\!] := [\![\mathbf{w}, \varphi\_1]\!] \sqcup [\![\mathbf{w}, \varphi\_2]\!] \\ [\![\mathbf{w}, \varphi\_1 \,\mathcal{U}\_I\, \varphi\_2]\!] &:= \bigsqcup\_{t \in I \cap [0,T]} \Big( [\![\mathbf{w}^t, \varphi\_2]\!] \sqcap \bigsqcap\_{t' \in [0,t]} [\![\mathbf{w}^{t'}, \varphi\_1]\!] \Big) \end{aligned} \tag{2}$$

For atomic formulas, ⟦**w**, f(*x*) > c⟧ stands for the vertical margin f(*x*) − c of the signal **w** at time 0. A negative robustness value indicates how far the formula is from being true. It follows from the definition that the robustness for the eventually modality is given by ⟦**w**, ♦<sub>[a,b]</sub>(x > 0)⟧ = ⊔<sub>t∈[a,b]∩[0,T]</sub> **w**(t)(x).
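A minimal evaluator for this robust semantics can be sketched as follows. The sampling at unit intervals and the nested-tuple formula encoding are our own illustrative assumptions:

```python
# Sketch of the robust semantics of Definition 4 for a small STL fragment,
# on signals sampled at unit intervals. Formulas are nested tuples, e.g.
# ("always", ("atom", f)); temporal operators here range over the whole suffix.

def rob(w, phi, t=0):
    """Robustness of phi on the t-shifted suffix of the sampled signal w."""
    op = phi[0]
    if op == "atom":                  # ("atom", f), meaning f(w(t)) > 0
        return phi[1](w[t])
    if op == "not":
        return -rob(w, phi[1], t)
    if op == "and":
        return min(rob(w, phi[1], t), rob(w, phi[2], t))
    if op == "or":
        return max(rob(w, phi[1], t), rob(w, phi[2], t))
    if op == "always":                # infimum over the suffix
        return min(rob(w, phi[1], u) for u in range(t, len(w)))
    if op == "eventually":            # supremum over the suffix
        return max(rob(w, phi[1], u) for u in range(t, len(w)))
    raise ValueError(op)

speed = [100, 110, 125, 90]
phi = ("always", ("atom", lambda v: 120 - v))       # always (speed < 120)
assert rob(speed, phi) == -5

phi2 = ("eventually", ("atom", lambda v: v - 120))  # eventually (speed > 120)
assert rob(speed, phi2) == 5
```

Note how negation and the min/max interpretation of ∧/∨ make the two results duals of each other, as the semantics prescribes.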

The above robustness notion taken from [12] is therefore *spatial*. Other robustness notions take *temporal* aspects into account, too, such as "how long before the deadline the required event occurs". See e.g. [2,12]. Our choice of spatial robustness in this paper is for the sake of simplicity, and is thus not essential.

The original semantics of STL is Boolean, given as usual by a binary relation |= between signals and formulas. The robust semantics refines the Boolean one in the following sense: ⟦**w**, ϕ⟧ > 0 implies **w** |= ϕ, and ⟦**w**, ϕ⟧ < 0 implies **w** ⊭ ϕ; see [17, Prop. 16]. Optimization-based falsification via robust semantics hinges on this refinement.

#### **2.2 Hill Climbing-Guided Falsification**

As we discussed in the introduction, the falsification problem attracts growing industrial and academic attention. Its solution methodology by hill-climbing optimization is an established field, too: see [1,3,5,9,11–13,15,26,28,34,38] and the tools Breach [11] and S-TaLiRo [5]. We formulate the problem and the methodology, for later use in describing our multi-armed bandit-based algorithm.

**Definition 5 (falsifying input).** Let M be a system model, and ϕ be an STL formula. A signal **u**: [0, T] → ℝ<sup>|**Var**|</sup> is a *falsifying input* if ⟦M(**u**), ϕ⟧ < 0; the latter implies M(**u**) ⊭ ϕ.

The use of the quantitative robust semantics ⟦M(**u**), ϕ⟧ ∈ ℝ ∪ {∞, −∞} in the above problem enables the use of hill-climbing optimization.

**Definition 6 (hill climbing-guided falsification).** Assume the setting in Definition 5. For finding a falsifying input, the methodology of *hill climbing-guided falsification* is presented in Algorithm 1.

Here the function HILL-CLIMB makes a guess of an input signal **u**<sub>k</sub>, aiming at minimizing the robustness ⟦M(**u**<sub>k</sub>), ϕ⟧. It does so by learning from the previous observations (**u**<sub>l</sub>, ⟦M(**u**<sub>l</sub>), ϕ⟧)<sub>l∈[1,k−1]</sub> of input signals **u**<sub>1</sub>, ..., **u**<sub>k−1</sub> and their corresponding robustness values (cf. Table 1).

The HILL-CLIMB function can be implemented by various stochastic optimization algorithms. Examples are CMA-ES [6] (used in our experiments), SA, and GNM [30].
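The overall loop of Definition 6 can be sketched as follows. The toy local random search below merely stands in for CMA-ES/SA/GNM, and the one-dimensional "model" is made up for illustration:

```python
# Illustrative sketch of hill climbing-guided falsification (Definition 6):
# repeatedly propose inputs, keep those that decrease robustness, and stop
# when robustness goes negative.
import random

def falsify(model_rob, initial, budget=300, step=1.0, seed=0):
    """Minimize the robustness model_rob(u); return a falsifying input or None."""
    rng = random.Random(seed)
    u, best = initial, model_rob(initial)
    for _ in range(budget):
        if best < 0:
            return u                                      # negative robustness: falsified
        cand = [x + rng.uniform(-step, step) for x in u]  # perturb the input signal
        r = model_rob(cand)
        if r < best:                                      # hill climbing: keep improvements
            u, best = cand, r
    return u if best < 0 else None

# Toy stand-in for ⟦M(u), always(speed < 120)⟧, where the "output speed"
# is simply 100 + u[0] (a hypothetical black box, not a real Simulink model):
model_rob = lambda u: 120 - (100 + u[0])

u = falsify(model_rob, initial=[0.0])
assert u is not None and model_rob(u) < 0
```

Real optimizers such as CMA-ES are far more sample-efficient; the point here is only the structure: a black-box robustness oracle driven toward negative values.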

#### **3 Our Multi-armed Bandit-Based Falsification Algorithm**

In this section, we present our contribution, namely a falsification algorithm that addresses the scale problem in Boolean superposition (see Sect. 1). The main novelties in the algorithm are as follows.


Later, in Sect. 4, we demonstrate that combining those two features gives rise to falsification algorithms that successfully cope with the scale problem in Boolean superposition.

Our algorithms focus on a fragment of STL as target specifications. They are called *(disjunctive and conjunctive) safety properties*. In Sect. 3.1 we describe this fragment of STL, and introduce necessary adaptation of the semantics. After reviewing the MAB problem in Sect. 3.2, we present our algorithms in Sects. 3.3, 3.4.


#### **3.1 Conjunctive and Disjunctive Safety Properties**

**Definition 7 (conjunctive/disjunctive safety property).** An STL formula of the form □<sub>I</sub>(ϕ<sub>1</sub> ∧ ϕ<sub>2</sub>) is called a *conjunctive safety property*; an STL formula of the form □<sub>I</sub>(ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>) is called a *disjunctive safety property*.

It is known that, in industry practice, a majority of specifications is of the form □<sub>I</sub>(ϕ<sub>1</sub> → ϕ<sub>2</sub>), where ϕ<sub>1</sub> describes a trigger and ϕ<sub>2</sub> describes a countermeasure that should follow. This property is equivalent to □<sub>I</sub>(¬ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>), and is therefore a disjunctive safety property.

In Sects. 3.3 and 3.4, we present two falsification algorithms, for conjunctive and disjunctive safety properties respectively. For the reason we just discussed, we expect the disjunctive algorithm to be the more important one in real-world application scenarios. In fact, the disjunctive algorithm turns out to be more complicated, and it is best introduced as an extension of the conjunctive algorithm.

We define the restriction of robust semantics to a (sub)set of time instants. Note that we do not require S ⊆ [0, T] to be a single interval.

**Definition 8 (⟦**w**, ψ⟧<sub>S</sub>, robustness restricted to S ⊆ [0, T]).** Let **w**: [0, T] → ℝ<sup>|**Var**|</sup> be a signal, ψ be an STL formula, and S ⊆ [0, T] be a subset. We define the *robustness of* **w** *under* ψ *restricted to* S by

$$[\![\mathbf{w}, \psi]\!]\_{S} \;:=\; \bigsqcap\_{t \in S} [\![\mathbf{w}^{t}, \psi]\!]. \tag{3}$$

Obviously, ⟦**w**, ψ⟧<sub>S</sub> < 0 implies that there exists t ∈ S such that ⟦**w**<sup>t</sup>, ψ⟧ < 0. We derive the following easy lemma; it is used later in our algorithm.

**Lemma 9.** *In the setting of Definition 8, consider a disjunctive safety property* ϕ ≡ □<sub>I</sub>(ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>)*, and let* S := {t ∈ I ∩ [0, T] | ⟦**w**<sup>t</sup>, ϕ<sub>1</sub>⟧ < 0}*. Then* ⟦**w**, ϕ<sub>2</sub>⟧<sub>S</sub> < 0 *implies* ⟦**w**, □<sub>I</sub>(ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>)⟧ < 0*.*
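Definition 8 and Lemma 9 can be illustrated on sampled signals (sample indices stand in for time instants; all values are hypothetical):

```python
# Sketch of restricted robustness (Definition 8) and the argument of Lemma 9.

def rob_restricted(w, psi_rob, S):
    """[[w, psi]]_S: infimum over t in S of the robustness of psi at the t-shift."""
    return min(psi_rob(w, t) for t in S)

# Disjunctive safety property always(phi1 or phi2), with
# phi1 = (rpm <= 4000) and phi2 = (speed > 20), over paired (rpm, speed) samples.
w = [(3000, 30), (4200, 25), (4500, 15), (3900, 10)]
rob1 = lambda w, t: 4000 - w[t][0]   # robustness of phi1 at instant t
rob2 = lambda w, t: w[t][1] - 20     # robustness of phi2 at instant t

S = [t for t in range(len(w)) if rob1(w, t) < 0]   # instants where phi1 is false
assert S == [1, 2]

# [[w, phi2]]_S < 0, so by Lemma 9 the disjunctive safety property is falsified:
assert rob_restricted(w, rob2, S) == -5
overall = min(max(rob1(w, t), rob2(w, t)) for t in range(len(w)))
assert overall < 0
```

Restricting attention to S lets a falsifier work on ϕ<sub>2</sub> only where ϕ<sub>1</sub> is already violated, which is exactly how Algorithm 5 avoids superposing the two robustness scales.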

#### **3.2 The Multi-Armed Bandit (MAB) Problem**

The *multi-armed bandit* (MAB) problem describes a situation where,


The best strategy, of course, is to keep playing the best arm A<sub>max</sub>, i.e. the one whose average reward avg(μ<sub>max</sub>) is the greatest. This best strategy is infeasible, however, since the distributions μ<sub>1</sub>, ..., μ<sub>n</sub> are initially unknown. Therefore the gambler must learn about μ<sub>1</sub>, ..., μ<sub>n</sub> through attempts.

The MAB problem exemplifies the "learning by trying" paradigm of *reinforcement learning*, and is thus heavily studied. The greatest challenge is to balance *exploration* and *exploitation*. A greedy (i.e. exploitation-only) strategy plays the arm whose empirical average reward is maximal. However, since rewards are random, this way the gambler can miss another arm whose real performance is even better but has not yet revealed itself. Therefore one needs to mix in exploration, too, occasionally trying empirically non-optimal arms, in order to identify their true performance.

The relevance of MAB to our current problem is as follows. Falsifying a conjunctive safety property □<sub>I</sub>(ϕ<sub>1</sub> ∧ ϕ<sub>2</sub>) amounts to finding a time instant t ∈ I at which either ϕ<sub>1</sub> or ϕ<sub>2</sub> is falsified. We can see the two subformulas ϕ<sub>1</sub> and ϕ<sub>2</sub> as two arms, and this constitutes an instance of the MAB problem. In particular, playing an arm translates to a falsification attempt by hill climbing, and collecting rewards translates to spending time to minimize the robustness. We show in Sects. 3.3–3.4 that this basic idea extends to disjunctive safety properties □<sub>I</sub>(ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>), too.


A rigorous formulation of the MAB problem is presented for the record.

**Definition 10 (the multi-armed bandit problem).** The *multi-armed bandit* (MAB) problem is formulated as follows.

**Input:** arms (A<sub>1</sub>, ..., A<sub>n</sub>), the associated probability distributions μ<sub>1</sub>, ..., μ<sub>n</sub> over ℝ, and a time horizon H ∈ ℕ ∪ {∞}.

**Goal:** synthesize a sequence A<sub>i1</sub>A<sub>i2</sub> ... A<sub>iH</sub>, so that the cumulative reward Σ<sup>H</sup><sub>k=1</sub> rew<sub>k</sub> is maximized. Here the reward rew<sub>k</sub> of the k-th attempt is sampled from the distribution μ<sub>ik</sub> associated with the arm A<sub>ik</sub> played at the k-th attempt.

We introduce some notation for later use. Let (A<sub>i1</sub> ... A<sub>ik</sub>, rew<sub>1</sub> ... rew<sub>k</sub>) be a *history*, i.e. the sequence of arms played so far (here i<sub>1</sub>, ..., i<sub>k</sub> ∈ [1, n]), together with the sequence of rewards obtained by those attempts (rew<sub>l</sub> is sampled from μ<sub>il</sub>).

For an arm A<sub>j</sub>, its *visit count* N(j, A<sub>i1</sub>A<sub>i2</sub> ... A<sub>ik</sub>, rew<sub>1</sub>rew<sub>2</sub> ... rew<sub>k</sub>) is the number of occurrences of A<sub>j</sub> in A<sub>i1</sub>A<sub>i2</sub> ... A<sub>ik</sub>. Its *empirical average reward* R(j, A<sub>i1</sub>A<sub>i2</sub> ... A<sub>ik</sub>, rew<sub>1</sub>rew<sub>2</sub> ... rew<sub>k</sub>) is the average of rew<sub>l</sub> over {l ∈ [1, k] | i<sub>l</sub> = j}, i.e. the average reward of the arm A<sub>j</sub> in the history. When the history is obvious from the context, we simply write N(j, k) and R(j, k).

**MAB Algorithms.** There have been a number of algorithms proposed for the MAB problem; each of them gives a *strategy* (also called a *policy*) that tells which arm to play, based on the previous attempts and their rewards. The focus here is how to resolve the exploration-exploitation trade-off. Here we review two well-known algorithms.

*The* ε*-Greedy Algorithm.* This is a simple algorithm that spares a small fraction ε of the chances for empirically non-optimal arms; the spared probability ε is distributed uniformly over the arms. See Algorithm 2.
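Since Algorithm 2 is not reproduced here, the following is our hedged reconstruction of the ε-greedy strategy in Python; the two Gaussian arms are hypothetical:

```python
# Sketch of the eps-greedy MAB strategy: with probability epsilon pick a
# uniformly random arm, otherwise the arm with the best empirical average.
import random

def eps_greedy(counts, sums, epsilon, rng):
    """counts[j], sums[j]: visit count and cumulative reward of arm j."""
    n = len(counts)
    if rng.random() < epsilon or all(c == 0 for c in counts):
        return rng.randrange(n)                       # explore
    avgs = [sums[j] / counts[j] if counts[j] else float("-inf") for j in range(n)]
    return max(range(n), key=lambda j: avgs[j])       # exploit

# Two arms with hypothetical reward distributions; arm 1 is better on average.
rng = random.Random(1)
arms = [lambda: rng.gauss(0.3, 0.1), lambda: rng.gauss(0.7, 0.1)]
counts, sums = [0, 0], [0.0, 0.0]
for _ in range(500):
    j = eps_greedy(counts, sums, epsilon=0.1, rng=rng)
    r = arms[j]()
    counts[j] += 1
    sums[j] += r

assert counts[1] > counts[0]   # the better arm ends up played far more often
```

Even a 10% exploration rate suffices here: once exploration has sampled the better arm a few times, exploitation locks onto it.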

*The UCB1 Algorithm.* The UCB1 (*upper confidence bound*) algorithm is more complex; it comes with a theoretical upper bound on the *regret*, i.e. the gap between the expected cumulative reward and the optimal (but infeasible) cumulative reward (i.e. the result of always playing the optimal arm A<sub>max</sub>). It is known that the UCB1 algorithm's regret is at most O(√(nH log H)) after H attempts, improving on the naive random strategy (whose expected regret is O(H)).

See Algorithm 3. The algorithm is deterministic, and picks the arm that maximizes the value shown in Line 1. The first term R(j, k−1) is the *exploitation* factor, reflecting the arm's empirical performance. The second term is the *exploration* factor; note that it is bigger if the arm A<sub>j</sub> has been played less frequently. Note also that the exploration factor eventually decays over time: the denominator grows roughly as O(k), while the numerator grows as O(ln k).

#### **Algorithm 3.** The UCB1 algorithm for multi-armed bandits

**Require:** the setting of Def. 10, and a constant c > 0
At the k-th attempt, choose the arm A<sub>ik</sub> as follows:
1: i<sub>k</sub> ← arg max<sub>j∈[1,n]</sub> ( R(j, k−1) + c √( 2 ln(k−1) / N(j, k−1) ) )
2: **return** i<sub>k</sub>
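The UCB1 choice rule of Algorithm 3 can be sketched as:

```python
# Sketch of the UCB1 rule: deterministically pick the arm maximizing the
# empirical average plus a confidence-based exploration bonus.
import math

def ucb1_choose(counts, sums, k, c=1.0):
    """Pick the arm index for the k-th attempt (k >= 1, as in Algorithm 3)."""
    for j, cnt in enumerate(counts):
        if cnt == 0:
            return j                  # play each arm once before scoring
    def score(j):
        avg = sums[j] / counts[j]                               # exploitation term
        bonus = c * math.sqrt(2 * math.log(k - 1) / counts[j])  # exploration term
        return avg + bonus
    return max(range(len(counts)), key=score)

# The exploration bonus decays for frequently played arms:
counts, sums = [10, 2], [5.0, 1.0]    # both arms have empirical average 0.5...
assert ucb1_choose(counts, sums, k=13) == 1   # ...so the less-visited arm wins
```

With equal empirical averages, the bonus √(2 ln 12 / N) is larger for the arm visited twice than for the one visited ten times, so UCB1 explores it.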

#### **Algorithm 4.** Our MAB-guided algorithm I: *conjunctive* safety properties

**Require:** a system model M, an STL formula ϕ ≡ □<sub>I</sub>(ϕ<sub>1</sub> ∧ ϕ<sub>2</sub>), and a budget K
1: **function** MAB-FALSIFY-CONJ-SAFETY(M, ϕ, K)
2: rb ← ∞; k ← 0 ▷ rb is the smallest robustness seen so far, for either □<sub>I</sub>ϕ<sub>1</sub> or □<sub>I</sub>ϕ<sub>2</sub>
3: **while** rb ≥ 0 and k ≤ K **do** ▷ iterate while not yet falsified and within budget
4: k ← k + 1
5: i<sub>k</sub> ← MAB( (ϕ<sub>1</sub>, ϕ<sub>2</sub>), (R(ϕ<sub>1</sub>), R(ϕ<sub>2</sub>)), ϕ<sub>i1</sub> ... ϕ<sub>ik−1</sub>, rew<sub>1</sub> ... rew<sub>k−1</sub> ) ▷ an MAB choice of i<sub>k</sub> ∈ {1, 2} for optimizing the reward R(ϕ<sub>ik</sub>)
6: **u**<sub>k</sub> ← HILL-CLIMB( (**u**<sub>l</sub>, rb<sub>l</sub>)<sub>l∈[1,k−1], i<sub>l</sub>=i<sub>k</sub></sub> ) ▷ suggestion of the next input **u**<sub>k</sub> by hill climbing, based on the previous observations on the formula ϕ<sub>ik</sub> (those on the other formula are ignored)
7: rb<sub>k</sub> ← ⟦M(**u**<sub>k</sub>), □<sub>I</sub>ϕ<sub>ik</sub>⟧
8: **if** rb<sub>k</sub> < rb **then** rb ← rb<sub>k</sub>
9: **u** ← **u**<sub>k</sub>
10: **return u** if rb < 0; Failure otherwise, that is, no falsifying input found within budget K

**Algorithm 5.** Our MAB-guided algorithm II: *disjunctive* safety properties

**Require:** a system model M, an STL formula ϕ ≡ □<sub>I</sub>(ϕ<sub>1</sub> ∨ ϕ<sub>2</sub>), and a budget K
1: **function** MAB-FALSIFY-DISJ-SAFETY(M, ϕ, K)

The same as Algorithm 4, except that Line 7 is replaced by the following Line 7'.

7′: rb<sub>k</sub> ← ⟦M(**u**<sub>k</sub>), ϕ<sub>ik</sub>⟧<sub>Sk</sub>, where S<sub>k</sub> = {t ∈ I ∩ [0, T] | ⟦(M(**u**<sub>k</sub>))<sup>t</sup>, ϕ<sub>3−ik</sub>⟧ < 0}; here ϕ<sub>3−ik</sub> denotes the other formula than ϕ<sub>ik</sub> among ϕ<sub>1</sub>, ϕ<sub>2</sub>

#### **3.3 Our MAB-Guided Algorithm I: Conjunctive Safety Properties**

Our first algorithm targets conjunctive safety properties. It is based on our identification of an MAB instance in a Boolean conjunction in falsification, as discussed just above Definition 10. The technical novelty lies in the way we combine MAB algorithms and hill-climbing optimization; specifically, we introduce the notion of *hill-climbing gain* as a reward notion in MAB (Definition 11). This first algorithm also paves the way to the one for disjunctive safety properties (Sect. 3.4).

The algorithm is shown in Algorithm 4. Some remarks are in order.

Algorithm 4 aims to falsify a conjunctive safety property ϕ ≡ □<sub>I</sub>(ϕ<sub>1</sub> ∧ ϕ<sub>2</sub>). Its overall structure is to *interleave* two sequences of falsification attempts, both of which are hill climbing-guided. The two sequences of attempts aim to falsify □<sub>I</sub>ϕ<sub>1</sub> and □<sub>I</sub>ϕ<sub>2</sub>, respectively. Note that ⟦M(**u**), ϕ⟧ ≤ ⟦M(**u**), □<sub>I</sub>ϕ<sub>1</sub>⟧; therefore falsification of □<sub>I</sub>ϕ<sub>1</sub> implies falsification of ϕ. The same holds for □<sub>I</sub>ϕ<sub>2</sub>, too.

In Line 5 we run an MAB algorithm to decide which of □<sub>I</sub>ϕ<sub>1</sub> and □<sub>I</sub>ϕ<sub>2</sub> to target in the k-th attempt. The function MAB takes the following as its arguments: (1) the list of arms, given by the formulas ϕ<sub>1</sub>, ϕ<sub>2</sub>; (2) their rewards R(ϕ<sub>1</sub>), R(ϕ<sub>2</sub>); (3) the history ϕ<sub>i1</sub> ... ϕ<sub>ik−1</sub> of previously played arms (i<sub>l</sub> ∈ {1, 2}); and (4) the history rew<sub>1</sub> ... rew<sub>k−1</sub> of previously observed rewards. This way, the type of the MAB function in Line 5 matches the format in Definition 10, and thus the function can be instantiated with any MAB algorithm, such as Algorithms 2–3.

The only missing piece is the definition of the rewards R(ϕ<sub>1</sub>), R(ϕ<sub>2</sub>). We introduce the following notion, tailored for combining MAB and hill climbing.

**Definition 11 (hill-climbing gain).** In Line 5 of Algorithm 4, the reward R(ϕ<sub>i</sub>) of the arm ϕ<sub>i</sub> (where i ∈ {1, 2}) is defined by

$$\mathcal{R}(\varphi_i) = \begin{cases} \dfrac{\text{max-rb}(i,\,k-1) - \text{last-rb}(i,\,k-1)}{\text{max-rb}(i,\,k-1)} & \text{if } \varphi_i \text{ has been played before} \\ 0 & \text{otherwise} \end{cases}$$

Here max-rb(i, k−1) := max{rbₗ | l ∈ [1, k−1], iₗ = i} (i.e., the greatest rbₗ so far, among those attempts where ϕᵢ was played), and last-rb(i, k−1) := rb_{l_last}, where l_last is the greatest l ∈ [1, k−1] such that iₗ = i (i.e., the last rbₗ for ϕᵢ).

Since we try to minimize the robustness values rbₗ through falsification attempts, we can expect that rbₗ for a fixed arm ϕᵢ decreases over time. (In the case of the hill-climbing algorithm CMA-ES that we use, this is in fact guaranteed.) Therefore the value max-rb(i, k−1) in the definition of R(ϕᵢ) is the first observed robustness value. The numerator max-rb(i, k−1) − last-rb(i, k−1) then represents how much robustness we have reduced so far by hill climbing, hence the name "hill-climbing gain." The denominator max-rb(i, k−1) is there for normalization.
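As an illustration, the reward of Definition 11 can be computed directly from the history of (arm, robustness) pairs. The following is a minimal Python sketch (the function and variable names are ours, not from Algorithm 4):

```python
def hill_climbing_gain(history, i, k):
    """Reward R(phi_i) per Definition 11 (a sketch; names are ours).

    `history` is a list of (arm, robustness) pairs for attempts 1..k-1.
    """
    # Robustness values rb_l observed in attempts where arm i was played.
    rbs = [rb for (arm, rb) in history[:k - 1] if arm == i]
    if not rbs:
        return 0.0  # phi_i has not been played before
    max_rb = max(rbs)   # greatest (under monotone decrease: first) rb for arm i
    last_rb = rbs[-1]   # most recent rb for arm i
    # Normalized robustness reduction achieved by hill climbing so far.
    return (max_rb - last_rb) / max_rb

history = [(1, 10.0), (2, 8.0), (1, 6.0), (1, 4.0)]
print(hill_climbing_gain(history, 1, 5))  # (10 - 4) / 10 = 0.6
print(hill_climbing_gain(history, 2, 5))  # (8 - 8) / 8 = 0.0
```

Because the numerator and denominator scale together, multiplying all robustness values of one arm by a constant leaves the reward unchanged, which is why this reward is insensitive to the scale problem.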

In Algorithm 4, the value rbₖ is given by the robustness ⟦M(**u**ₖ), □_I ϕ_{iₖ}⟧. Therefore the MAB choice in Line 5 essentially picks the iₖ for which hill climbing yields the greater effect (while also taking exploration into account; see Sect. 3.2).

In Line 6 we conduct hill-climbing optimization (see Sect. 2.2). The function Hill-Climb learns from the previous attempts **u**_{l₁}, …, **u**_{l_m} for the same formula ϕ_{iₖ} and their resulting robustness values rb_{l₁}, …, rb_{l_m}. It then suggests the next input signal **u**ₖ, one that is likely to minimize the (unknown) function underlying the correspondences **u**_{l_j} ↦ rb_{l_j} (j ∈ [1, m]).

Lines 6–8 read as follows: the hill-climbing algorithm suggests a single input **u**ₖ, which is then selected or rejected (Line 8) based on the robustness value it yields (Line 7). We note that this is a simplified picture: in our implementation, which uses CMA-ES (an evolutionary algorithm), we maintain a population of some ten particles, and each of them is moved multiple times (in our case, three times) before the best one is chosen as **u**ₖ.

#### **3.4 Our MAB-Guided Algorithm II: Disjunctive Safety Properties**

The other main algorithm of ours aims to falsify a *disjunctive* safety property ϕ ≡ □_I(ϕ₁ ∨ ϕ₂). We believe this problem setting is even more important than the conjunctive case, since it encompasses conditional safety properties (i.e., those of the form □_I(ϕ₁ → ϕ₂)). See Sect. 3.1 for discussions.

In the disjunctive setting, the challenge is that falsification of □_I ϕᵢ (with i ∈ {1, 2}) does *not* necessarily imply falsification of □_I(ϕ₁ ∨ ϕ₂). This is unlike the conjunctive setting. Therefore we need some adaptation of Algorithm 4, so that the two interleaved sequences of falsification attempts for ϕ₁ and ϕ₂ are not totally independent of each other. Our solution consists of *restricting* time instants to those where ϕ₂ is false, in a falsification attempt for ϕ₁ (and vice versa), in the way described in Definition 8.

Algorithm 5 shows our MAB-guided algorithm for falsifying a disjunctive safety property □_I(ϕ₁ ∨ ϕ₂). The only visible difference is that Line 7 in Algorithm 4 is replaced with Line 7′. The new Line 7′ measures the quality of the suggested input signal **u**ₖ restricted to the region Sₖ in which the other formula is already falsified. Lemma 9 guarantees that, if rbₖ < 0, then the input signal **u**ₖ indeed falsifies the original specification □_I(ϕ₁ ∨ ϕ₂).

The assumption that makes Algorithm 5 sensible is that, although it can be hard to find a time instant at which both ϕ₁ and ϕ₂ are false (as required in falsifying □_I(ϕ₁ ∨ ϕ₂)), falsifying ϕ₁ (or ϕ₂) individually is not hard. Without this assumption, the region Sₖ in Line 7′ would be empty most of the time. Our experiments in Sect. 4 demonstrate that this assumption is valid in many problem instances, and that Algorithm 5 is effective.
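The restricted robustness check of Line 7′ can be illustrated on discretized signals. The sketch below is our own simplification (Definition 8 and Lemma 9 are not reproduced in this excerpt): the robustness of □_I ϕ₁ is minimized only over sample points where ϕ₂ is already false, so a negative result witnesses an instant at which both disjuncts fail.

```python
import math

def restricted_robustness(rob1, rob2):
    """Robustness of phi_1 restricted to instants where phi_2 is already
    false (a sketch of Line 7'; the discretization into sample points is
    our own simplification of Definition 8).

    rob1[t], rob2[t]: pointwise robustness of phi_1, phi_2 at sample t.
    """
    # S_k: sample points where phi_2 is falsified.
    restricted = [r1 for r1, r2 in zip(rob1, rob2) if r2 < 0]
    if not restricted:
        return math.inf  # S_k is empty: no instant falsifies phi_2
    return min(restricted)

rob1 = [3.0, 1.0, -0.5, 2.0]
rob2 = [-1.0, 0.5, -2.0, 0.1]
# At sample 2 both phi_1 and phi_2 are false, so the disjunction
# []_I (phi_1 \/ phi_2) is falsified there.
print(restricted_robustness(rob1, rob2))  # -0.5
```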

#### **4 Experimental Evaluation**

We name MAB-ε-greedy and MAB-UCB the two versions of our MAB algorithm, using the strategies ε-greedy (see Algorithm 2) and UCB1 (see Algorithm 3), respectively. We compared the proposed approach (both versions) with a state-of-the-art falsification framework, namely Breach [11]. Breach encapsulates several hill-climbing optimization algorithms, including *CMA-ES (covariance matrix adaptation evolution strategy)* [6], *SA (simulated annealing)*, and *GNM (global Nelder-Mead)* [30]. In our experience, CMA-ES outperforms the other hill-climbing solvers in Breach, so the experiments for both Breach and our approach rely on the CMA-ES solver.
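Since Algorithms 2–3 are not reproduced in this excerpt, the two arm-selection strategies can be sketched in their standard textbook form (function names and the exploration constant are ours):

```python
import math
import random

def ucb1(rewards, counts, total_plays):
    """UCB1 (standard formulation): pick the arm maximizing
    reward + sqrt(2 ln n / n_i), favoring rarely played arms."""
    def score(i):
        if counts[i] == 0:
            return math.inf  # play every arm at least once
        return rewards[i] + math.sqrt(2 * math.log(total_plays) / counts[i])
    return max(range(len(rewards)), key=score)

def eps_greedy(rewards, eps=0.1, rng=random):
    """Epsilon-greedy: explore a random arm with probability eps,
    otherwise exploit the arm with the highest reward."""
    if rng.random() < eps:
        return rng.randrange(len(rewards))
    return max(range(len(rewards)), key=lambda i: rewards[i])

# Arm 0 has the larger hill-climbing gain; with equal play counts the
# exploration bonus ties, so UCB1 picks arm 0.
print(ucb1([0.6, 0.1], [5, 5], 10))  # 0
```

In the falsification loop, `rewards` would hold the hill-climbing gains of Definition 11 and `counts` the number of attempts per sub-formula.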

Experiments were executed using Breach 1.2.13 on an Amazon EC2 c4.large instance (2.9 GHz Intel Xeon E5-2666, 2 virtual CPU cores, 4 GB RAM).

**Benchmarks.** We selected three benchmark models from the literature, each having different specifications. The first is the *Automatic Transmission* (AT) model [16,24]. It has two input signals, *throttle* ∈ [0, 100] and *brake* ∈ [0, 325], and computes the car's *speed*, the engine rotation *rpm* (in revolutions per minute), and the automatically selected *gear*. The specifications concern the relations among the three output signals, checking whether the car is subject to unexpected or unsafe behaviors. The second benchmark is the *Abstract Fuel Control* (AFC) model [16,25]. It takes two input signals, *pedal angle* ∈ [8.8, 90] and *engine speed* ∈ [900, 1100], and outputs the critical signal *air-fuel ratio* (*AF*), which influences fuel efficiency and car performance. The value is expected to stay close to a reference value *AFref*; *mu* ≡ |*AF* − *AFref*|/*AFref* is the deviation of *AF* from *AFref*. The specifications check whether this property holds under both *normal mode* and *power enrichment mode*. The third benchmark is a model of a *magnetic levitation system with a NARMA-L2 neurocontroller* (NN) [7,16]. It takes one input signal, *Ref* ∈ [1, 3], which is the reference for the output signal *Pos*, the position of a magnet suspended above an electromagnet. The specifications say that the position should approach the reference signal within a few seconds whenever the two are not close.




We built the benchmark set Bbench as shown in Table 2a, which reports the name of the model and its specifications (ID and formula). In total, we found 11 specifications. In order to enlarge the benchmark set and obtain specifications of different complexity, we artificially modified a constant of each specification (turned into a parameter named τ if it occurs in a time interval, and ρ otherwise): for each specification S, we generated m different versions, named Sᵢ with i ∈ {1, …, m}; the complexity of the specification (in terms of the difficulty of falsifying it) increases with increasing i.² In total, we produced 60 specifications. Column *parameter* in the table shows which concrete values we used for the parameters ρ and τ. Note that all the specifications but one are disjunctive safety properties (i.e., □_I(ϕ₁ ∨ ϕ₂)), as they are the most difficult case and the main target of our approach; we add AT5 as an example of a conjunctive safety property (i.e., □_I(ϕ₁ ∧ ϕ₂)).

Our approach has been proposed with the aim of tackling the scale problem. Therefore, to better show how our approach mitigates this problem, we generated a second benchmark set Sbench as follows. We selected 15 specifications from Bbench (with concrete values for the parameters) and, for each specification S, we changed the corresponding Simulink model by multiplying one of its outputs by a factor 10ᵏ, with k ∈ {−2, 0, 1, 2, 3} (note that this also includes the original, with scale factor 10⁰); the specification was modified accordingly, by multiplying by the scale factor the constants that are compared with the scaled output. We name a specification S scaled with factor 10ᵏ as Sᵏ. Table 2b reports the IDs of the original specifications, the output that has been scaled, and the scale factors used; in total, the benchmark set Sbench contains 60 specifications.

**Experiment.** In our context, an *experiment* consists of executing an approach A (either Breach, MAB-ε-greedy, or MAB-UCB) over a specification S for 30 *trials*, using different initial seeds. For each experiment, we record the *success* SR, i.e., the number of trials in which a falsifying input was found, and the average execution *time* of the trials. Complete experimental results are reported in Appendix A of the extended version [37]³. We report aggregated results in Table 3.

For benchmark set Bbench, it reports aggregated results for each group of specifications obtained from S (i.e., all the different versions Sᵢ obtained by changing the value of the parameter); for benchmark set Sbench, instead, results are aggregated for each scaled specification Sᵏ (considering the versions Sᵏᵢ obtained by changing the parameter value). We report the minimum, maximum, and average number of successes SR, and the time in seconds. For MAB-ε-greedy and MAB-UCB, both for SR and time, we also report the average percentage difference⁴ (Δ) w.r.t. the corresponding value of Breach.

**Comparison.** In the following, we compare two approaches A₁, A₂ ∈ {Breach, MAB-ε-greedy, MAB-UCB} by comparing the number of their successes SR and their average execution *time* using the non-parametric Wilcoxon signed-rank test with a 5%

² Note that we performed this classification based on the falsification results of Breach.

³ The code, models, and specifications are available online at https://github.com/ERATOMMSD/FalStar-MAB.

⁴ Δ = ((m − b) ∗ 100)/(0.5 ∗ (m + b)), where m is the result of MAB and b that of Breach.

**Table 3.** Aggregated results for benchmark sets Bbench and Sbench (SR: # successes out of 30 trials. Time in secs. Δ: percentage difference w.r.t. Breach). Cases of outperformance are highlighted, indicated by a positive Δ for SR and a negative Δ for time.


level of significance⁵ [35]; the null hypothesis is that there is no difference between applying A₁ and A₂ in terms of the compared measure (SR or time).

#### **4.1 Evaluation**

We evaluate the proposed approach through the following research questions.

**RQ1** *Which is the best MAB algorithm for our purpose?*

In Sect. 3.2, we described that the proposed approach can be executed using two different strategies for choosing the arm in the MAB problem, namely MAB-ε-greedy and MAB-UCB. We here assess which one is better in terms of SR and time. From the results in Table 3, it seems that MAB-UCB provides slightly better performance in terms of SR; this is confirmed by the Wilcoxon test applied over all the experiments (i.e., on the non-aggregated data reported in Appendix A of the extended version [37]): the null hypothesis that the choice between the two strategies has no impact on SR is rejected with p-value 0.005089, and the alternative hypothesis that MAB-UCB achieves better SR is accepted with p-value 0.9975; similarly, the null hypothesis that there is no difference in terms of time is rejected with p-value 3.495e−06, and the alternative hypothesis that MAB-UCB is faster is accepted with p-value 1. Therefore, in the following RQs, we compare Breach with only the MAB-UCB version of our approach.

⁵ We checked that the distributions are not normal with the non-parametric Shapiro-Wilk test.

#### **RQ2** *Does the proposed approach effectively solve the scale problem?*

We here assess whether our approach is effective in tackling the scale problem. Table 4 reports the complete experimental results over Sbench for Breach and MAB-UCB; for each specification S, all its scaled versions are reported in increasing order of the scaling factor. We observe that changing the scaling factor affects (sometimes greatly) the number of successes SR of Breach; for example, for AT5₅ and AT5₇ it goes from 30 to 0. For MAB-UCB, instead, SR is similar across the scaled versions of each specification: this shows that the approach is robust w.r.t. the scale problem, as the "hill-climbing gain" reward in Definition 11 eliminates the impact of scaling, and the UCB1 algorithm balances the exploration and exploitation of the two sub-formulas. The observation is confirmed by the Wilcoxon test over SR: the null hypothesis is rejected with p-value 1.808e−09, and the alternative hypothesis is accepted with p-value 1. Instead, the null hypothesis that there is no difference in terms of time cannot be rejected (p-value 0.3294).

#### **RQ3** *How does the proposed approach behave on benchmarks that have not been scaled?*

In RQ2, we checked whether the proposed approach is able to tackle the scale problem for which it was designed. Here, instead, we investigate how it behaves on specifications that have not been artificially scaled (i.e., those in Bbench). From Table 3 (upper part), we observe that MAB-UCB is always better than Breach both in terms of SR and time, as shown by the highlighted cases. This is confirmed by the Wilcoxon test over SR and time: the null hypotheses are rejected with p-values of 6.02e−08 and 1.41e−08, respectively, and the alternative hypotheses that MAB-UCB is better are both accepted


**Table 4.** Experimental results – Sbench (SR: # successes out of 30 trials. Time in secs)

with p-value = 1. This means that the proposed approach can also handle specifications that do not suffer from the scale problem, and so it can be used with any kind of specification.

#### **RQ4** *Is the proposed approach more effective than an approach based on rescaling?*

A naïve solution to the scale problem could be to rescale the signals used in a specification to the same scale. Thanks to the results of RQ2, we can compare against this possible baseline approach using the scaled benchmark set Sbench. For example, AT5 suffers from the scale problem, as *speed* is one order of magnitude smaller than *rpm*. However, from Table 3, we observe that the scaling that would be done by the baseline approach (i.e., running Breach over AT5¹) is not effective, as its SR is 0.4/30, much lower than the original SR of 14.1/30 of the unscaled approach using Breach. Our approach, instead, raises SR to 28.4/30 and 27.6/30 using the two proposed versions. By monitoring the Breach execution, we noticed that the naïve approach fails because it tries to falsify *rpm* < 4780, which, however, is not falsifiable; our approach, instead, learns that it must try to falsify *speed* < ρ. More details are given in the extended version [37].

#### **5 Conclusion and Future Work**

In this paper, we propose a solution to the *scale problem* that affects falsification of specifications containing Boolean connectives. The approach combines multi-armed bandit algorithms with hill climbing-guided falsification. Experiments show that the approach is robust under the change of scales, and it outperforms a state-of-the-art falsification tool. The approach currently handles binary specifications. As future work, we plan to generalize it to complex specifications having more than two Boolean connectives.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **StreamLAB: Stream-based Monitoring of Cyber-Physical Systems**

Peter Faymonville, Bernd Finkbeiner, Malte Schledjewski, Maximilian Schwenger(B), Marvin Stenger, Leander Tentrup, and Hazem Torfah

Reactive Systems Group, Saarland University, Saarbrücken, Germany {faymonville,finkbeiner,schledjewski,schwenger, stenger,tentrup,torfah}@react.uni-saarland.de

**Abstract.** With the ever-increasing autonomy of cyber-physical systems, monitoring becomes an integral part of ensuring the safety of the system at runtime. StreamLAB is a monitoring framework with a high degree of expressiveness and strong correctness guarantees. Specifications are written in RTLola, a stream-based specification language with formal semantics. StreamLAB provides an extensive analysis of the specification, including the computation of memory consumption and run-time guarantees. We demonstrate the applicability of StreamLAB on typical monitoring tasks for cyber-physical systems, such as sensor validation and system health checks.

#### **1 Introduction**

In stream-based monitoring, we translate input streams containing data collected at runtime, such as sensor readings, into output streams containing aggregate statistics, such as an average value, a counter, or the integral of a signal. Trigger specifications define thresholds and other logical conditions on the values of these output streams, and raise an alarm or execute some other predefined action if the condition becomes true. The advantage of this setup is great expressiveness and easy-to-reuse, compositional specifications. Existing stream-based languages like Lola [9,12] are based on the synchronous programming paradigm, where all streams are synchronized via a global clock. In each step, the new values of all output streams are computed in terms of the values of the other streams at the current or a previous time step. This paradigm provides a simple and natural evaluation model that fits well with typical implementations on

This work was partially supported by the German Research Foundation (DFG) as part of the Collaborative Research Center "Foundations of Perspicuous Software Systems" (TRR 248, 389792660), and by the European Research Council (ERC) Grant OSARES (No. 683300).

synchronous hardware. In real-time applications, however, the assumption that all data arrives synchronously is often simply not true. Consider, for example, an autonomous drone with several sensors, such as a GPS module, an inertia measurement unit, and a laser distance meter. While a synchronous arrival of all measured values would be desirable, some sensors measure at a higher frequency than others. Moreover, the sensors do not necessarily operate on a common clock, so their readings drift apart over time.

In this paper we present the monitoring framework StreamLAB. We lift the synchronicity assumption to allow for the monitoring of asynchronous systems. The basis of the framework is RTLola, an extension of the stream-based runtime verification language Lola. RTLola introduces two new key concepts into Lola:


- *sliding windows*, which aggregate stream values over real-time intervals, and
- *fixed-rate output streams*, which are evaluated periodically, independently of the arrival of input events.

As with any semantic extension, the challenge in the design of RTLola is to maintain the efficiency of the monitoring. Obviously, not all RTLola specifications can be monitored with constant memory: since the rates of the input streams are unknown, an arbitrary number of events may occur in the span of a fixed real-time unit. Thus, for aggregations such as the mean, which require storing the whole sequence of values, no amount of constant memory will always suffice. We can, nevertheless, again identify an efficiently monitorable fragment that covers many specifications of practical interest. For the space-efficient aggregation over real-time sliding windows, we partition the real-time axis into equally sized intervals. The size of the intervals is dictated by the rate of the output streams. For certain common types of aggregations, such as the sum or the number of entries, the values within each interval can be pre-aggregated and stored only in this summarized form. In a static analysis of the specification, we identify parts of the specification with unbounded memory consumption, and compute bounds for all other parts. In this way, we can determine early whether a particular specification can be executed on a system with limited memory.

**Related Work.** There is a rich body of work on monitoring real-time properties. Many monitoring approaches are based on real-time variants of temporal logics [3,11,16–18,24]. Maler and Nickovic present a monitoring algorithm for properties written in signal temporal logic (STL) by reducing STL formulas via a Boolean abstraction to formulas in the real-time logic MITL [21]. Building on these ideas, Donzé et al. present an algorithm for the monitoring of STL properties over continuous signals [10]. The algorithm computes the robustness degree with which a piecewise-continuous signal satisfies or violates an STL formula. Towards more practical approaches, Basin et al. extend metric logics with parameterization [8]. A monitoring algorithm for the extension is implemented in the tool MonPoly [5]. MonPoly was introduced as a tool for monitoring usage-control policies. Another extension, to metric dynamic logic, was implemented in

**Fig. 1.** Illustration of the decoupled input and output using aggregations.

the tool Aerial [7]. However, most monitors generated from temporal logics are limited to Boolean verdicts.

StreamLAB uses the stream-based language RTLola as its core specification language. RTLola builds upon Lola [9,12], a stream-based language originally developed for monitoring synchronous hardware circuits, by adding the concepts discussed above. Stream-based monitoring languages are significantly more expressive than temporal logics. Other prominent stream-based monitoring approaches are the *Copilot* framework [23] and the tool BeepBeep 3 [15]. Copilot is a dataflow language based on several declarative stream processing languages [9,14]. From a specification in Copilot, constant-space and constant-time C programs implementing embedded monitors are generated. The BeepBeep 3 tool uses an SQL-like language defined over streams of events. In addition to stream processing, it contains operators such as slicing, where inputs can be separated into several different traces, and windowing, where aggregations over a sliding window can be computed. Unlike RTLola, BeepBeep and Copilot assume a synchronous computation model, where all events arrive at a fixed rate. Two asynchronous real-time monitoring approaches are TeSSLa [19] and Striver [13]. TeSSLa allows for monitoring piecewise-constant signals where streams can emit events at different speeds with arbitrary latencies. Neither language provides sliding windows or the definition of fixed-rate output streams. The efficient evaluation of aggregations on sliding windows [20] has previously been studied in the context of temporal logic [4]. Basin et al. present an algorithm for combining the elements of subsequences of a sliding window with an associative operator, which reuses the results of the subsequences in the evaluation of the next window [6].

#### **2 Real-Time Lola**

RTLola extends the stream-based specification language Lola [12] with real-time features. In the stream-based processing paradigm, sensor readings are viewed as input streams to a stream processing engine that computes outputs, in the form of streams, over the values of the input streams. For example, the RTLola specification

```
input altitude : Float32
output tooLow := altitude < 200.0
```
checks whether a drone flies at an altitude of less than 200 feet. For each reading of the altitude sensor, a new value for the output stream *tooLow* is computed. Streams marked with the "**trigger**" keyword alert the user when the value of the trigger is true. In the following example, the user is warned when the drone flies below the allowed altitude.

```
trigger tooLow "flying below minimum altitude"
```

Output streams in RTLola are computed from values of the input streams, other output streams and their own past values. If we want to count the number of times the drone dives below 200 feet we can specify the stream

```
output count := (if tooLow then 1 else 0)
  + count.offset(by:-1).defaults(to:0)
```
Here, the stream *count* computes its new value by increasing its latest value by 1 in case the drone currently flies below the permitted altitude. The expression count.offset(by:-1) represents the last value of the stream. We call such expressions "lookup expressions". The default operator e.defaults(to:0) returns the value 0 in case the value of e is not defined. This can happen when a stream is evaluated for the first time and looks up its last value.
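To make the lookup semantics concrete, here is a toy Python re-implementation of the `count` stream above (our own simulation for illustration, not the StreamLAB engine):

```python
def run_count(altitudes, limit=200.0):
    """Evaluate the `count` stream over a sequence of altitude readings.

    Mirrors: output count := (if tooLow then 1 else 0)
                             + count.offset(by:-1).defaults(to:0)
    """
    count_values = []
    for alt in altitudes:
        too_low = alt < limit
        # count.offset(by:-1).defaults(to:0): the previous value of the
        # stream, or the default 0 at the very first evaluation.
        prev = count_values[-1] if count_values else 0
        count_values.append((1 if too_low else 0) + prev)
    return count_values

print(run_count([250.0, 180.0, 150.0, 220.0, 190.0]))  # [0, 1, 2, 2, 3]
```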

In RTLola, we do not impose any assumption on the arrival frequency of input streams. Each stream can produce new values individually and at arbitrary points in time. This can lead to problems when a burst of new input values occurs in a short amount of time: the monitor then needs to evaluate all output streams, exerting a lot of pressure on the system. To prevent this, RTLola distinguishes between two kinds of outputs. *Event-based* outputs are computed whenever new input values arrive and should thus only contain inexpensive operations. All streams discussed above were event-based. In contrast, there are *periodic* outputs such as the following:

```
output freqDev @5Hz := altitude.aggregate(over: 200ms,
                                          using: count) < 5
```
Here, *freqDev* will be evaluated every 200 ms, as indicated by the "@5Hz" annotation, independently of arriving input values. The stream *freqDev* does not access the event-based input *altitude* directly, but uses a *sliding window* expression to count the number of times a new value for *altitude* occurred within the last 200 ms. The value of *freqDev* thus represents the number of measurements the monitor received from the altimeter. Comparing this value against the expected number of readings allows for detecting deviations, and thus a potentially damaged sensor.

Sliding windows allow for decoupling event-based and periodic streams, as illustrated in Fig. 1. Since the specifier has no control over the frequency of event-based streams, these streams should be cheap to evaluate. More expensive operations, such as sliding windows, may only be used in periodic streams, which increases the monitor's robustness.

#### **2.1 Examples**

In the following, we present several properties showcasing RTLola's expressivity. The specifications are simplified for illustration and thus not immediately applicable to the real world.

*Sensor Validation.* When a sensor starts to deteriorate, it can misbehave and drop single measurements. To verify that a GPS sensor produces values at its specified frequency, in this example 10 Hz, we count the number of sensor values in a continuous window and compare it against the expected amount of events in this time frame.

```
input lat : Float32 , lon : Float32
output gps_freq @10Hz :=
  lat.aggregate(over: 1s, using: count).defaults(to:9)
trigger gps_freq < 9 "GPS sensor frequency < 9 Hz"
```
Assuming that we have another sensor measuring the true air speed, we can check whether the measured data matches the GPS data using RTLola's computation primitives. For this, we first compute the difference in longitude and latitude between the current and the last measurement. The Euclidean distance then gives the length of the movement vector; dividing it by the amount of time that has passed between the two GPS measurements yields a discrete approximation of the velocity.

```
input velo : Float32
output δlon := lon - lon.offset(by:-1).defaults(to:lon)
output δlat := lat - lat.offset(by:-1).defaults(to:lat)
output gps_dist := sqrt(δlon * δlon + δlat * δlat)
output gps_velo := gps_dist
  / (time - time.offset(by:-1).defaults(to:0.0))
trigger abs(gps_velo - velo) > 0.1 "Deviating velocity"
```
When the pathfinding algorithm of the mission planner takes longer than expected, the system remains in a state without a target location and thus hovers in place. Such a hover period can be detected by computing the distance covered in the last seconds. For this, we integrate the assumed velocity. We also exclude a strong headwind as a culprit for the low change in position.

```
input wnd_dir : Float32 , wnd_spd : Float32
output dir := arctan(lat / lon)
output headwind := abs(wnd_dir - dir) < 0.2 ∧ wnd_spd > 10.0
output hovering @1Hz := velo.aggregate(over: 5s, using: ∫)
  .defaults(to:0.5) < 0.5 ∧ ¬headwind.hold().defaults(to:⊥)
trigger hovering "Long hover phase"
```
#### **3 Performance Guarantees via Static Analysis**

#### **3.1 Type System**

RTLola is a strongly-typed specification language. Every expression has two orthogonal types: a value type and a stream type. The *value* type is Bool, String, Int, or Float. It indicates the usual semantics of a value or expression and the amount of memory required to store the value. The *stream* type indicates when a value is evaluated. For periodic streams, the stream type defines the frequency in which it is computed. Event-based streams do not have a predetermined period. The stream type for an event-based stream identifies a set of input streams, indicating that the event-based stream is extended whenever there is, synchronously, an event on all input streams. Event-based streams may also depend on input streams not listed in the type; in such cases, the type system requires an explicit use of the 0-order *sample&hold* operator.

The type system provides runtime guarantees for the monitor: Independently of the arrival of input data, it is guaranteed that all required data is available whenever a stream is extended. Either the data was just received as input event, was computed as output stream value, or the specifier provided a default value. The type system can, thus, eliminate classes of specification problems like unintentionally accessing a slower stream from a faster stream. Whenever possible, the tool provides automatic type inference.

#### **3.2 Sliding Windows**

We use two techniques to ensure that we only need a bounded amount of memory to compute sliding windows. Meertens [22] classifies an aggregation γ : A∗ → B as a list homomorphism if it can be split into a mapping function m: A → T, an associative reduction function r : T × T → T, a finalization function f : T → B, and a neutral element ε ∈ T with ∀t ∈ T : r(t, ε) = r(ε, t) = t. For such functions, rather than aggregating the whole list at once, one can apply m to each element, reduce the intermediate results in any grouping (by associativity), and finalize the result to obtain the same value. The second technique, by Li et al. [20], divides a time interval into panes of equal size. For each pane, we aggregate all inputs and store only a fixed number of intermediate values. The type system ensures that sliding windows only occur in periodic streams, so by choosing the pane size as the inverse of the frequency, paning does not change the result. StreamLAB provides several pre-defined aggregation functions, such as counting, integration, summation, product, minimization, and maximization.
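A minimal Python sketch of pane-based aggregation for the sum, which is a list homomorphism with m = id, r = +, f = id, and ε = 0 (class and method names are ours):

```python
from collections import deque

class PanedSum:
    """Pane-based sliding-window sum, sketched after Li et al.: the
    window is split into fixed-size panes and each pane stores only its
    pre-aggregated value, so memory is bounded by the number of panes."""

    def __init__(self, num_panes):
        # One slot per pane, all initialized to the neutral element 0.
        self.panes = deque([0.0] * num_panes, maxlen=num_panes)

    def add(self, value):
        """Aggregate an incoming event into the current (newest) pane."""
        self.panes[-1] += value  # the reduction r = +

    def close_pane(self):
        """Called once per output period: rotate in a fresh pane."""
        self.panes.append(0.0)   # maxlen evicts the oldest pane

    def window_value(self):
        # Reduce the pre-aggregated panes with the associative r = +.
        return sum(self.panes)

w = PanedSum(num_panes=5)  # e.g. a 1 s window at a 5 Hz output rate
w.add(1.0)
w.add(2.0)
w.close_pane()
w.add(4.0)
print(w.window_value())  # 7.0
```

Only `num_panes` floats are stored regardless of how many events arrive, which is exactly the space saving the pre-aggregation buys.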

#### **3.3 Memory Analysis**

StreamLAB computes the worst-case memory consumption of a specification. For this, an annotated dependency graph (ADG) is constructed in which each stream s constitutes a node v*s* and, whenever s accesses a stream s′, there is an edge from v*s* to v*s′*. Edges are annotated according to the type of access: if s accesses s′ discretely with offset n, or with a sliding-window aggregation of duration d and aggregation function γ, then the edge e = (v*s*, v*s′*) is labeled with λ(e) = n or λ(e) = (d, γ), respectively. Nodes of periodic streams are annotated with their frequency: if stream s has period 200 ms, then its node is labeled with π(v*s*) = 5 Hz. Memory bounds for discrete-time offsets can be computed as for Lola [9]. We extend this algorithm with new computation rules to determine the memory bounds for real-time expressions. For each edge e = (v, v′) in the ADG, we can determine how many events of v′ must be stored for the computation of v using the rules in Fig. 2. Here, only γ is required to be a list homomorphism. A strict upper bound on the required memory is then the sum of the memory requirements of the individual streams. This, however, only covers the memory needed for storing values; it does not take book-keeping data structures and the internal representation of the specification into account. Assuming reasonably small expressions (depth ≤ 64), this additional memory can be bounded by 1 kB per stream plus a flat 10 kB of working memory.
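Since Fig. 2 is not reproduced here, the following Python sketch only suggests the flavor of the computation: the per-edge bounds used below (|n| + 1 stored values for a discrete offset n, and d · freq pane values for a homomorphic window of duration d in a stream of frequency freq) are plausible assumptions for illustration, not the paper's exact rules.

```python
import math

def edge_memory(label, freq):
    """Memory needed for one ADG edge, given the accessing stream's
    frequency in Hz. Labels are ("offset", n) or ("window", (d, gamma))."""
    kind, payload = label
    if kind == "offset":
        # Assumed rule: a discrete access with offset n keeps |n| + 1 values.
        return abs(payload) + 1
    if kind == "window":
        # Assumed rule: a homomorphic window of duration d (seconds) keeps
        # one intermediate value per pane, i.e. d * freq panes.
        d, _gamma = payload
        return math.ceil(d * freq)
    raise ValueError(f"unknown access kind: {kind}")

def memory_bound(edges, rates):
    """edges: list of (src, dst, label); rates: stream -> frequency in Hz
    of the accessing (periodic) stream. The strict upper bound is the sum
    of the per-edge requirements."""
    return sum(edge_memory(label, rates[src]) for src, _dst, label in edges)
```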


**Fig. 2.** Computation of memory bound over the dependency graph.

**Fig. 3.** Illustration of the data flow. The EM manages input events, TM schedules periodic tasks, and Eval manages the evaluation of streams.

#### **4 Processing Engine**

The processing engine consists of three components. The *EventManager (EM)* reads events from an input source, such as standard input or a CSV file, and translates string values into the internal representation. The values are mapped to the corresponding input streams of the specification. Using a multiple-sender, single-receiver channel, the EM pushes each event onto a working queue. The *TimeManager (TM)* schedules the evaluation of periodic streams: it computes the hyper-period of all streams and groups them by equal deadlines. Whenever a deadline is due, the corresponding streams are pushed onto the working queue using the same channel as the EM. This ensures that event-based and periodic evaluation cycles occur in the correct order even under high pressure. Lastly, the *Evaluator (Eval)* manages the evaluation of streams and the storage of computed values: it repeatedly pops items off the working queue and evaluates the respective streams.

When monitoring a system online, the TM uses the internal system clock for scheduling tasks. When monitoring offline, however, this is no longer possible because the point in time at which a stream is due to be evaluated depends on the input event. Thus, before the EM pushes an event onto the working queue, it transmits the latest timestamp to the TM. The TM then decides whether some periodic streams need to be evaluated. If so, it effectively goes back in time by pushing the respective tasks onto the working queue before acknowledging the EM. Only upon receiving the acknowledgment does the EM send the event to the working queue. Figure 3 illustrates the information flow between the components.
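The offline hand-shake can be sketched as follows (hypothetical Python, not StreamLAB's Rust implementation): the EM forwards each event's timestamp to the TM, which first pushes every periodic task that became due, so the shared queue preserves the temporal order of event-based and periodic evaluation cycles.

```python
from queue import Queue

class TimeManager:
    def __init__(self, periods, queue):
        # periods: stream -> period in seconds; first deadline = one period.
        self.periods = dict(periods)
        self.next_due = dict(periods)
        self.queue = queue

    def advance_to(self, ts):
        # Push all periodic tasks due up to `ts`, earliest first ("going
        # back in time"), then return, which acknowledges the EM.
        while True:
            due = [(t, s) for s, t in self.next_due.items() if t <= ts]
            if not due:
                break
            t, s = min(due)
            self.queue.put(("periodic", s, t))
            self.next_due[s] += self.periods[s]

class EventManager:
    def __init__(self, tm, queue):
        self.tm, self.queue = tm, queue

    def push_event(self, ts, event):
        # Let the TM catch up first; only then enqueue the event itself.
        self.tm.advance_to(ts)
        self.queue.put(("event", event, ts))

# Demo: a 1 Hz periodic stream; an event at t = 2.5 s arrives after two
# deadlines (t = 1.0 and t = 2.0) have silently passed.
q = Queue()
em = EventManager(TimeManager({"gps_check": 1.0}, q), q)
em.push_event(2.5, {"lat": 49.25})
```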

#### **5 Experiments**

StreamLAB<sup>1</sup> is implemented in Rust. A major benefit of a Rust implementation is the connection to LLVM, which allows compilation to a large variety of platforms. Moreover, the requirements on the runtime environment are as low as those of C programs. This makes StreamLAB widely applicable.

The specifications presented in Sect. 2.1 have been tested on traces generated with the state-of-the-art flight simulator ArduPilot<sup>2</sup>. Each trace is the result of a drone flying one or more round trips over Saarland University and provides sensor information for longitude and latitude, true air velocity, wind direction and speed, as well as the number of available GPS satellites. The longest trace consists of slightly less than 433,000 events. StreamLAB successfully detected a variety of errors such as delayed sensor readings, GPS module failures, and phases without significant movement. For online runtime verification, the monitor reads an event from the simulator's output, processes the input data, and pauses until the next event is available. Whenever necessary, periodic streams are evaluated. Online monitoring of a simulation did not allow us to exhaust the capabilities of StreamLAB because generating events took significantly longer than processing them. The offline monitoring mode of StreamLAB allows the user to specify a delay at which consecutive events are read from a file. By gradually decreasing the delay between events until the pressure became too high, we determined a maximum input frequency of 647.2 kHz. When disabling the delay and running the monitor at maximum speed, StreamLAB processes a trace of length 432,961 in 0.67 s, i.e., each event takes 1545 ns to process, while three threads utilized 146% CPU. In terms of memory, the maximum resident set size amounted to 16 MB. This includes book-keeping data structures, the specification, evaluator code, and parts of the C standard library. While the evaluation does not require any heap allocation after the setup phase, the average stack size amounts to less than 1 kB. The experiment was conducted on a 3.3 GHz Intel Core i7 processor with 16 GB of 2133 MHz LPDDR3 RAM.

<sup>1</sup> www.stream-lab.org.

<sup>2</sup> ardupilot.org.

#### **6 Outlook**

The stream-based monitoring framework StreamLAB demonstrates the applicability of stream monitoring to cyber-physical systems. Previous versions of Lola have successfully been applied to networks and unmanned aircraft systems in cooperation with the German Aerospace Center (DLR) [1,2,12]. StreamLAB provides a modular, easy-to-understand specification language and design-time feedback for specifiers. This helps to improve the development process for cyber-physical systems. Coupled with the promising experimental results, this lays the foundation for further applications of the framework to real-world systems.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### VERIFAI**: A Toolkit for the Formal Design and Analysis of Artificial Intelligence-Based Systems**

Tommaso Dreossi(B) , Daniel J. Fremont(B) , Shromona Ghosh(B) , Edward Kim, Hadi Ravanbakhsh, Marcell Vazquez-Chanlatte, and Sanjit A. Seshia(B)

University of California, Berkeley, USA *{*tommasodreossi,dfremont,shromona.ghosh*}*@berkeley.edu, sseshia@eecs.berkeley.edu

**Abstract.** We present VERIFAI, a software toolkit for the formal design and analysis of systems that include artificial intelligence (AI) and machine learning (ML) components. VERIFAI particularly addresses challenges in applying formal methods to ML components, such as perception systems based on deep neural networks, and to systems containing them, as well as in modeling and analyzing system behavior in the presence of environment uncertainty. We describe the initial version of VERIFAI, which centers on simulation-based verification and synthesis, guided by formal models and specifications. We give examples of several use cases, including temporal-logic falsification, model-based systematic fuzz testing, parameter synthesis, counterexample analysis, and data set augmentation.

**Keywords:** Formal methods *·* Falsification *·* Simulation *·* Cyber-physical systems *·* Machine learning *·* Artificial intelligence *·* Autonomous vehicles

#### **1 Introduction**

The increasing use of artificial intelligence (AI) and machine learning (ML) in systems, including safety-critical systems, has brought with it a pressing need for formal methods and tools for their design and verification. However, AI/ML-based systems, such as autonomous vehicles, have certain characteristics that make the application of formal methods very challenging. We mention three key challenges here; see Seshia et al. [23] for an in-depth discussion. First, several uses of AI/ML are for *perception*, the use of computational systems to mimic human perceptual tasks such as object recognition and classification, conversing in natural language, etc. For such perception components,

This work was supported in part by NSF grants 1545126 (VeHICaL), 1646208, 1739816, and 1837132, the DARPA BRASS program under agreement number FA8750-16-C0043, the DARPA Assured Autonomy program, the iCyPhy center, and Berkeley Deep Drive. NVIDIA Corporation donated the Titan Xp GPU used for this research.

T. Dreossi, D. J. Fremont, S. Ghosh—These authors contributed equally to the paper.

writing a formal specification is extremely difficult, if not impossible. Additionally, the signals processed by such components can be very high-dimensional, such as streams of images or LiDAR data. Second, *machine learning* being a dominant paradigm in AI, formal tools must be compatible with the data-driven design flow for ML and also be able to handle the complex, high-dimensional structures in ML components such as deep neural networks. Third, the *environments* in which AI/ML-based systems operate can be very complex, with considerable uncertainty even about how many (which) agents are in the environment (both human and robotic), let alone about their intentions and behaviors. As an example, consider the difficulty in modeling urban traffic environments in which an autonomous car must operate. Indeed, AI/ML is often introduced into these systems precisely to deal with such complexity and uncertainty! From a formal methods perspective, this makes it very hard to create realistic environment models with respect to which one can perform verification or synthesis.

In this paper, we introduce the VERIFAI toolkit, our initial attempt to address the three core challenges—perception, learning, and environments—that are outlined above. VERIFAI takes the following approach:


VERIFAI is currently focused on AI-based cyber-physical systems (CPS), although its basic ideas can also be applied to other AI-based systems. As a pragmatic choice, we focus on simulation-based verification, where the simulator is treated as a black-box, so as to be broadly applicable to the range of simulators used in industry.1 The input to

<sup>1</sup> Our work is complementary to the work on industrial-grade simulators for AI/ML-based CPS. In particular, VERIFAI enhances such simulators by providing formal methods for modeling (via the SCENIC language), analysis (via temporal logic falsification), and parameter synthesis (via property-directed hyper/model-parameter synthesis).

VERIFAI is a "closed-loop" CPS model, comprising a composition of the AI-based CPS system under verification with an environment model, and a property on the closed-loop model. The AI-based CPS typically comprises a perception component (not necessarily based on ML), a planner/controller, and the plant (i.e., the system under control). Given these, VERIFAI offers the following use cases: (1) temporal-logic falsification; (2) model-based fuzz testing; (3) counterexample-guided data augmentation; (4) counterexample (error table) analysis; (5) hyper-parameter synthesis; and (6) model parameter synthesis. The novelty of VERIFAI is that it is the first tool to offer this suite of use cases in an integrated fashion, unified by a common representation of an abstract feature space, with an accompanying modeling language and search algorithms over this feature space, all provided in a modular implementation. The algorithms and formalisms in VERIFAI are presented in papers published by the authors in other venues (e.g., [7–10,12,15,22]). The problem of temporal-logic falsification or simulation-based verification of CPS models is well studied and several tools exist (e.g., [3,11]); our work was the first to extend these techniques to CPS models with ML components [7,8]. Work on verification of ML components, especially neural networks (e.g., [14,26]), is complementary to the system-level analysis performed by VERIFAI. Fuzz testing based on formal models is common in software engineering (e.g., [16]) but our work is unique in the CPS context. Similarly, property-directed parameter synthesis has also been studied in the formal methods/CPS community, but our work is the first to apply these ideas to the synthesis of hyper-parameters for ML training and of ML model parameters. Finally, to our knowledge, our work on augmenting training/test data sets [9], implemented in VERIFAI, is the first use of formal techniques for this purpose. In Sect. 2, we describe how the tool is structured so as to provide the above features. Section 3 illustrates the use cases via examples from the domain of autonomous driving.

### **2** VERIFAI **Structure and Operation**

VERIFAI is currently focused on simulation-based analysis and design of AI components for perception or control, potentially those using ML, in the context of a closedloop cyber-physical system. Figure 1 depicts the structure and operation of the toolkit.

**Inputs and Outputs:** Using VERIFAI requires setting up a simulator for the domain of interest. As we explain in Sect. 3, we have experimented with multiple robotics simulators and provide an easy interface to connect a new simulator. The user then constructs the inputs to VERIFAI, including (i) a simulatable model of the system, including code for one or more controllers and perception components, and a dynamical model of the system being controlled; (ii) a probabilistic model of the environment, specifying constraints on the workspace, the locations of agents and objects, and the dynamical behavior of agents, and (iii) a property over the composition of the system and its environment. VERIFAI is implemented in Python for interoperability with ML/AI libraries and simulators across platforms. The code for the controller and perception component can be arbitrary executable code, invoked by the simulator. The environment model typically comprises a definition in the simulator of the different types of agents, plus a description of their initial conditions and other parameters using the SCENIC probabilistic programming language [12]. Finally, the property to be checked can be expressed

**Fig. 1.** Structure and operation of VERIFAI.

using Metric Temporal Logic (MTL) [2,24], objective functions, or arbitrary code monitoring the property. The output of VERIFAI depends on the feature being invoked. For falsification, VERIFAI returns one or more *counterexamples*, simulation traces violating the property [7]. For fuzz testing, VERIFAI produces traces sampled from the distribution of behaviors induced by the probabilistic environment model [12]. Error table analysis involves collecting counterexamples generated by the falsifier into a table, on which we perform analysis to identify features that are correlated with property failures. Data augmentation uses falsification and error table analysis to generate additional data for training and testing an ML component [9]. Finally, the property-driven synthesis of model parameters or hyper-parameters generates as output a parameter evaluation that satisfies the specified property.

**Tool Structure:** VERIFAI is composed of four main modules, as described below:

*• Abstract Feature Space and* SCENIC *Modeling Language:* The abstract feature space is a compact representation of the possible configurations of the simulation. Abstract features can represent parameters of the environment, controllers, or of ML components. For example, when analyzing a visual perception system for an autonomous car, an abstract feature space could consist of the initial poses and types of all vehicles on the road. Note that this abstract space, compared to the concrete feature space of pixels used as input to the controller, is better suited to the analysis of the overall closed-loop system (e.g. finding conditions under which the car might crash).

VERIFAI provides two ways to construct abstract feature spaces. They can be constructed hierarchically, combining basic domains such as hyperboxes and finite sets into structures and arrays. For example, we could define a space for a car as a structure combining a 2D box for position with a 1D box for heading, and then create an array of these to get a space for several cars. Alternatively, VERIFAI allows a feature space to be defined using a program in the SCENIC language [12]. SCENIC provides convenient syntax for describing geometric configurations and agent parameters, and, as a probabilistic programming language, allows placing a distribution over the feature space which can be conditioned by declarative constraints.
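The hierarchical construction can be pictured with a minimal sketch (hypothetical names, not the actual VERIFAI API): basic hyperbox domains are combined into structures and arrays, each of which can sample a point of the abstract feature space.

```python
import random

class Box:
    """A hyperbox domain: one (lo, hi) interval per dimension."""
    def __init__(self, *intervals):
        self.intervals = intervals
    def sample(self):
        return tuple(random.uniform(lo, hi) for lo, hi in self.intervals)

class Struct:
    """Named combination of sub-domains, e.g. a car's position and heading."""
    def __init__(self, **fields):
        self.fields = fields
    def sample(self):
        return {name: dom.sample() for name, dom in self.fields.items()}

class Array:
    """A fixed-length array of identical sub-domains, e.g. several cars."""
    def __init__(self, domain, length):
        self.domain, self.length = domain, length
    def sample(self):
        return [self.domain.sample() for _ in range(self.length)]

# A car: 2D box for position, 1D box for heading; a scene holds three cars.
car = Struct(position=Box((0.0, 100.0), (-5.0, 5.0)),
             heading=Box((-3.14, 3.14)))
space = Array(car, length=3)
scene = space.sample()
```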


The communication between VERIFAI and the simulator is implemented in a client-server fashion using IPv4 sockets: VERIFAI sends configurations to the simulator, which then returns trajectories (traces). This architecture allows easy interfacing to a new simulator, and even to multiple simulators at the same time.
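A minimal sketch of this configuration/trace exchange (the length-prefixed JSON wire format here is hypothetical, and for brevity an in-process socket pair stands in for an IPv4 connection): the tool sends a configuration and the simulator answers with a trajectory.

```python
import json
import socket

def send_msg(sock, obj):
    # 4-byte big-endian length prefix, then the JSON payload.
    data = json.dumps(obj).encode()
    sock.sendall(len(data).to_bytes(4, "big") + data)

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf

def recv_msg(sock):
    n = int.from_bytes(_recv_exact(sock, 4), "big")
    return json.loads(_recv_exact(sock, n).decode())

# In-process demo: the "tool" sends a configuration sampled from the
# feature space; the "simulator" answers with a (time, speed) trace.
tool, sim = socket.socketpair()
send_msg(tool, {"ego_speed": 12.0, "cone_distance": 20.0})
config = recv_msg(sim)
send_msg(sim, {"trace": [[0.0, 0.0], [1.0, 12.0]]})
trace = recv_msg(tool)
tool.close()
sim.close()
```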

#### **3 Features and Case Studies**

This section illustrates the main features of VERIFAI through case studies demonstrating its various use cases and simulator interfaces. Specifically, we demonstrate model falsification and fuzz testing of an autonomous vehicle (AV) controller, data augmentation and error table analysis for a convolutional neural network, and model and hyperparameter tuning for a reinforcement learning-based controller.

#### **3.1 Falsification and Fuzz Testing**

VERIFAI offers a convenient way to debug systems through systematic testing. Given a model and a specification, the tool can use active sampling to automatically search for inputs driving the model towards a violation of the specification. VERIFAI can also perform model-based fuzz testing, exploring random variations of a scenario guided by formal constraints. To demonstrate falsification and fuzz testing, we consider two scenarios involving AVs simulated with the robotics simulator Webots [25]. For the experiments reported here, we used Webots 2018, which is commercial software.

In the first example, we falsify the controller of an AV which is responsible for safely maneuvering around a disabled car and traffic cones which are blocking the road. We implemented a hybrid controller which relies on perception modules for state estimation. Initially, the car follows its lane using standard computer vision (non-ML) techniques for line detection [20]. At the same time, a neural network (based on squeezeDet [27]) estimates the distance to the cones. When the distance drops below 15 m, the car performs a lane change, afterward switching back to lane-following.

The correctness of the AV is characterized by an MTL formula requiring the vehicle to maintain a minimum distance from the traffic cones and avoid overshoot while changing lanes. The task of the falsifier is to find small perturbations of the initial scene (generated by SCENIC) which cause the vehicle to violate this specification. We allowed perturbations of the initial positions and orientations of all objects, the color of the disabled car, and the cruising speed and reaction time of the ego car.

Our experiments showed that active samplers driven by the robustness of the MTL specification can efficiently discover scenes that confuse the controller and yield faulty behavior. Figure 2 shows an example, where the neural network detected the orange car instead of the traffic cones, causing the lane change to be initiated too early. As a result, the controller performed only an incomplete lane change, leading to a crash.

**Fig. 2.** A falsifying scene automatically discovered by VERIFAI. The neural network misclassifies the traffic cones because of the orange vehicle in the background, leading to a crash. Left: bird'seye view. Right: dash-cam view, as processed by the neural network.

In our second experiment, we used VERIFAI to simulate variations on an actual accident involving an AV [5]. The AV, proceeding straight through an intersection, was hit by a human turning left. Neither car was able to see the other because of two lanes of stopped traffic. Figure 3 shows a (simplified) SCENIC program we wrote to reproduce

**Fig. 3.** Left: Partial SCENIC program for the crash scenario. Car is an object class defined in the Webots world model (not shown), on is a SCENIC *specifier* positioning the object uniformly at random in the given region (e.g. the median line of a lane), (-0.5, 0.5) indicates a uniform distribution over that interval, and X@Y creates a vector with the given coordinates (see [12] for a complete description of SCENIC syntax). Right: (1) initial scene sampled from the program; (2) the red car begins its turn, unable to see the green car; (3) the resulting collision. (Color figure online)

the accident, allowing variation in the initial positions of the cars. We then ran simulations from random initial conditions sampled from the program, with the turning car using a controller trying to follow the ideal left-turn trajectory computed from OpenStreetMap data using the Intelligent Intersections Toolbox [17]. The car going straight used a controller which either maintained a constant velocity or began emergency braking in response to a message from a simulated "smart intersection" warning about the turning car. By sampling variations on the initial conditions, we could determine how much advance notice is necessary for such a system to robustly avoid an accident.

#### **3.2 Data Augmentation and Error Table Analysis**

Data augmentation is the process of supplementing training sets with the goal of improving the performance of ML models. Typically, datasets are augmented with transformed versions of preexisting training examples. In [9], we showed that augmentation with counterexamples is also an effective method for model improvement.

**Fig. 4.** This image generated by our renderer was misclassified by the NN. The network reported detecting only one car when there were two.

VERIFAI implements a counterexample-guided augmentation scheme, where a falsifier (see Sect. 3.1) generates misclassified data points that are then used to augment the original training set. The user can choose among different sampling methods, with passive samplers suited to generating diverse sets of data points while active samplers can efficiently generate similar counterexamples. In addition to the counterexamples themselves, VERIFAI also returns an error table aggregating information on the misclassifications that can be used to drive the retraining process. Figure 4 shows the rendering of a misclassified sample generated by our falsifier.

For our experiments, we implemented a renderer that generates images of road scenarios and tested the quality of our augmentation scheme on the squeezeDet convolutional neural network [27], trained for classification. We adopted three techniques to select augmentation images: (1) randomly sampling from the error table, (2) selecting the top *k* closest (most similar) samples from the error table, and (3) using PCA analysis to generate new samples. For details on the renderer and the results of counterexample-driven augmentation, see [9]. We show that incorporating the generated counterexamples during re-training improves the accuracy of the network.
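The first two selection strategies can be sketched as follows (illustrative Python; the error table is simplified here to a list of misclassified feature vectors, and all names are hypothetical):

```python
import math
import random

def select_random(table, k):
    """Strategy (1): randomly sample k rows from the error table,
    favoring diversity among the augmentation images."""
    return random.sample(table, k)

def select_closest(table, seed, k):
    """Strategy (2): the top-k rows most similar (Euclidean distance)
    to a given counterexample, favoring a cluster of related failures."""
    return sorted(table, key=lambda row: math.dist(row, seed))[:k]
```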

#### **3.3 Model Robustness and Hyperparameter Tuning**

In this final section, we demonstrate how VERIFAI can be used to tune test parameters and hyperparameters of AI systems. For the following case studies, we use OpenAI Gym [4], a framework for experimenting with reinforcement learning algorithms.

First, we consider the problem of testing the robustness of a learned controller for a cart-pole, i.e., a cart that balances an inverted pendulum. We trained a neural network to control the cart-pole using Proximal Policy Optimization [21] with 100k training episodes. We then used VERIFAI to test the robustness of the learned controller, varying the initial lateral position and rotation of the cart as well as the mass and length of the pole. Even for apparently robust controllers, VERIFAI was able to discover configurations for which the cart-pole failed to self-balance. Figure 5 shows 1000 iterations of the falsifier, where sampling was guided by the reward function used by OpenAI Gym to train the controller. This function provides a negative reward if the cart moves more than 2.4 m or if at any time the angle maintained by the pole exceeds 12°. For testing, we slightly modified these thresholds.

**Fig. 5.** The green dots represent model parameters for which the cart-pole controller behaved correctly, while the red dots indicate specification violations. Out of 1000 randomly-sampled model parameters, the controller failed to satisfy the specification 38 times. (Color figure online)
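This robustness test can be pictured as a simple sampling loop (schematic Python, not the actual VERIFAI sampler; `simulate` is a stand-in for an OpenAI Gym rollout of the trained controller, and the toy reward below is invented purely for illustration):

```python
import random

def falsify(sample, simulate, spec_threshold, iterations=1000):
    """Sample model parameters, run the system, and collect every
    parameter setting whose reward violates the specification."""
    violations = []
    for _ in range(iterations):
        params = sample()
        reward = simulate(params)
        if reward < spec_threshold:
            violations.append(params)
    return violations

# Toy stand-ins: vary pole mass and length; a pole that is too long or
# too heavy earns a negative (fake) reward, i.e. fails to self-balance.
sample = lambda: {"mass": random.uniform(0.05, 2.0),
                  "length": random.uniform(0.1, 2.0)}
simulate = lambda p: 1.0 - 0.4 * p["mass"] * p["length"]
bad = falsify(sample, simulate, spec_threshold=0.0)
```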

Finally, we used VERIFAI to study the effects of hyperparameters when training a neural network controller for a mountain car. In this case, the controller must learn to exploit momentum in order to climb a steep hill. Here, rather than searching for counterexamples, we look for a set of hyperparameters under which the network *correctly* learns to control the car. Specifically, we explored the effects of using different training algorithms (from a discrete set of choices) and the size of the training set. We used the VERIFAI falsifier to search the hyperparameter space, guided again by the reward function provided by OpenAI Gym (here the distance from the goal position), but negated so that falsification implied finding a controller which successfully climbs the hill. In this way VERIFAI built a table of safe hyperparameters. PCA analysis then revealed which hyperparameters the training process is most sensitive or robust to.

#### **4 Conclusion**

We presented VERIFAI, a toolkit for the formal design and analysis of AI/ML-based systems. Our implementation, plus the examples described in Sect. 3, are available in the tool distribution [1], including detailed instructions and expected output.

In future work, we plan to explore additional applications of VERIFAI, and to expand its functionality with new algorithms. Towards the former, we have already interfaced VERIFAI to the CARLA driving simulator [6], for more sophisticated experiments with autonomous cars, as well as to the X-Plane flight simulator [19], for testing an ML-based aircraft navigation system. More broadly, although our focus has been on CPS, we note that VERIFAI's architecture is applicable to other types of systems. Finally, for extending VERIFAI itself, we plan to move beyond directed simulation by incorporating symbolic methods, such as those used in finding adversarial examples.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **The Marabou Framework for Verification and Analysis of Deep Neural Networks**

Guy Katz<sup>1(B)</sup>, Derek A. Huang<sup>2</sup>, Duligur Ibeling<sup>2</sup>, Kyle Julian<sup>2</sup>, Christopher Lazarus<sup>2</sup>, Rachel Lim<sup>2</sup>, Parth Shah<sup>2</sup>, Shantanu Thakoor<sup>2</sup>, Haoze Wu<sup>2</sup>, Aleksandar Zeljić<sup>2</sup>, David L. Dill<sup>2</sup>, Mykel J. Kochenderfer<sup>2</sup>, and Clark Barrett<sup>2</sup>

<sup>1</sup> The Hebrew University of Jerusalem, Jerusalem, Israel guykatz@cs.huji.ac.il <sup>2</sup> Stanford University, Stanford, USA *{*huangda,duligur,kjulian3,clazarus,parth95,thakoor, haozewu,zeljic,dill,mykel,clarkbarrett*}*@stanford.edu, rachelim@cs.stanford.edu

**Abstract.** Deep neural networks are revolutionizing the way complex systems are designed. Consequently, there is a pressing need for tools and techniques for network analysis and certification. To help in addressing that need, we present *Marabou*, a framework for verifying deep neural networks. Marabou is an SMT-based tool that can answer queries about a network's properties by transforming these queries into constraint satisfaction problems. It can accommodate networks with different activation functions and topologies, and it performs high-level reasoning on the network that can curtail the search space and improve performance. It also supports parallel execution to further enhance scalability. Marabou accepts multiple input formats, including protocol buffer files generated by the popular TensorFlow framework for neural networks. We describe the system architecture and main components, evaluate the technique and discuss ongoing work.

#### **1 Introduction**

Recent years have brought about a major change in the way complex systems are being developed. Instead of spending long hours hand-crafting complex software, many engineers now opt to use *deep neural networks* (*DNNs*) [6,19]. DNNs are machine learning models, created by training algorithms that generalize from a finite set of examples to previously unseen inputs. Their performance can often surpass that of manually created software as demonstrated in fields such as image classification [16], speech recognition [8], and game playing [21].

Despite their overall success, the opacity of DNNs is a cause for concern, and there is an urgent need for certification procedures that can provide rigorous guarantees about network behavior. The formal methods community has

taken initial steps in this direction, by developing algorithms and tools for neural network verification [5,9,10,12,18,20,23,24]. A DNN verification query consists of two parts: (i) a neural network, and (ii) a property to be checked; and its result is either a formal guarantee that the network satisfies the property, or a concrete input for which the property is violated (a counter-example). A verification query can encode the fact, e.g., that a network is robust to small adversarial perturbations in its input [22].

A neural network comprises *neurons*, organized in layers. The network is evaluated by assigning values to the neurons in the input layer, and then using these values to iteratively compute the assignments of neurons in each succeeding layer. Finally, the values of the neurons in the last layer are computed; these constitute the network's output. A neuron's assignment is determined by computing a weighted sum of the assignments of neurons from the preceding layer, and then applying a non-linear activation function to the result, such as the Rectified Linear Unit (ReLU) function, ReLU(x) = max(0, x). Thus, a network can be regarded as a set of *linear constraints* (the weighted sums) and a set of *non-linear constraints* (the activation functions). In addition to a neural network, a verification query includes a property to be checked, which is given in the form of linear or non-linear constraints on the network's inputs and outputs. The verification problem thus reduces to finding an assignment of neuron values that satisfies all the constraints simultaneously, or determining that no such assignment exists.
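The evaluation just described can be made concrete with a tiny example (plain Python, with arbitrary illustrative weights): each layer first computes the weighted sums, which correspond to the linear constraints of a verification query, and then applies ReLU, which corresponds to the non-linear constraints.

```python
def relu(x):
    return max(0.0, x)

def evaluate(layers, inputs):
    """layers: list of (weights, biases) per layer; weights[i][j] multiplies
    input j of neuron i. ReLU is applied to every layer but the last."""
    values = list(inputs)
    for k, (weights, biases) in enumerate(layers):
        # Linear part: the weighted sums.
        sums = [sum(w * v for w, v in zip(row, values)) + b
                for row, b in zip(weights, biases)]
        # Non-linear part: the activation functions (skipped on the output).
        last = (k == len(layers) - 1)
        values = sums if last else [relu(s) for s in sums]
    return values

# A 2-input, 2-hidden-neuron, 1-output network with made-up weights.
layers = [
    ([[1.0, -1.0], [2.0, 1.0]], [0.0, -1.0]),  # hidden layer, ReLU
    ([[1.0, 1.0]], [0.0]),                     # output layer, no activation
]
out = evaluate(layers, [1.0, 2.0])
```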

This paper presents a new tool for DNN verification and analysis, called *Marabou*. The Marabou project builds upon our previous work on the Reluplex project [2,7,12,13,15,17], which focused on applying SMT-based techniques to the verification of DNNs. Marabou follows the Reluplex spirit in that it applies an SMT-based, *lazy search* technique: it iteratively searches for an assignment that satisfies all given constraints, but treats the non-linear constraints lazily in the hope that many of them will prove irrelevant to the property under consideration, and will not need to be addressed at all. In addition to search, Marabou performs deduction aimed at learning new facts about the non-linear constraints in order to simplify them.

The Marabou framework is a significant improvement over its predecessor, Reluplex. Specifically, it includes the following enhancements and modifications:


- *A native simplex core*: the new simplex core was tailored for a smooth integration with the Marabou framework and eliminates much of the overhead in Reluplex due to the use of GLPK.


Marabou is available online [14] under the permissive modified BSD license.

**Fig. 1.** The main components of Marabou.

#### **2 Design of Marabou**

Marabou regards each neuron in the network as a variable and searches for a variable assignment that simultaneously satisfies the query's linear constraints and non-linear constraints. At any given point, Marabou maintains the current variable assignment, lower and upper bounds for every variable, and the set of current constraints. In each iteration, it then changes the variable assignment in order to (1) correct a violated linear constraint, or (2) correct a violated non-linear constraint.

The Marabou verification procedure is sound and complete, i.e., the aforementioned loop eventually terminates with a correct answer. This can be shown via a straightforward extension of the soundness and completeness proof for Reluplex [12]. However, in order to guarantee termination, Marabou only supports activation functions that are piecewise-linear. The tool already has built-in support for the ReLU function and the Max function max(x<sub>1</sub>,...,x<sub>n</sub>), and it is modular in the sense that additional piecewise-linear functions can be added easily.

Another important aspect of Marabou's verification strategy is deduction: specifically, the derivation of tighter lower and upper variable bounds. The motivation is that such bounds may transform piecewise-linear constraints into linear constraints, by restricting them to one of their linear segments. To achieve this, Marabou repeatedly examines linear and non-linear constraints, and also performs network-level reasoning, with the goal of discovering tighter variable bounds.

Next, we describe Marabou's main components (see also Fig. 1).

#### **2.1 Simplex Core (***Tableau* **and** *BasisFactorization* **Classes)**

The simplex core is the part of the system responsible for making the variable assignment satisfy the linear constraints. It does so by implementing a variant of the *simplex algorithm* [3]. In each iteration, it changes the assignment of some variable x, and consequently the assignment of any variable y that is connected to x by a linear equation. Selecting x and determining its new assignment is performed using standard algorithms—specifically, the *revised simplex method* in which the various linear constraints are kept in implicit matrix form, and the steepest-edge and Harris' ratio test strategies for variable selection.

Creating an efficient simplex solver is complicated. In Reluplex, we delegated the linear constraints to an external solver, GLPK. Our motivation for implementing a new custom solver in Marabou was twofold: first, we observed in Reluplex that the repeated translation of queries into GLPK and extraction of results from GLPK was a limiting factor on performance; and second, a black box simplex solver did not afford the flexibility we needed in the context of DNN verification. For example, in a standard simplex solver, variable assignments are typically pressed against their upper or lower bounds, whereas in the context of a DNN, other assignments might be needed to satisfy the non-linear constraints. Another example is the deduction capability, which is crucial for efficiently verifying a DNN and whose effectiveness might depend on the internal state of the simplex solver.

#### **2.2 Piecewise-Linear Constraints (***PiecewiseLinearConstraint* **Class)**

Throughout its execution, Marabou maintains a set of piecewise-linear constraints that represent the DNN's non-linear functions. In iterations devoted to satisfying these constraints, Marabou looks for constraints that are not satisfied by the current assignment. If such a constraint is found, Marabou changes the assignment so as to satisfy that constraint. Alternatively, in order to guarantee eventual termination, if Marabou detects that a certain constraint is repeatedly left unsatisfied, it may perform a *case-split* on that constraint: a process in which the piecewise-linear constraint ϕ is replaced by an equivalent disjunction of linear constraints c<sub>1</sub> ∨ ... ∨ c<sub>n</sub>. Marabou considers these disjuncts one at a time and checks for satisfiability. If the problem is satisfiable when ϕ is replaced by some c<sub>i</sub>, then the original problem is also satisfiable; otherwise, the original problem is unsatisfiable.
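For instance, a ReLU constraint y = ReLU(x) case-splits into the two linear cases x ≤ 0 ∧ y = 0 and x ≥ 0 ∧ y = x. A small sketch (illustrative only, not Marabou's internal representation) checking that this disjunction is equivalent to the original constraint:

```python
# The piecewise-linear constraint y = ReLU(x), as a predicate on assignments.
relu_constraint = lambda a: a["y"] == max(0.0, a["x"])

# Its case-split into two purely linear cases:
case_inactive = lambda a: a["x"] <= 0.0 and a["y"] == 0.0      # c1: x <= 0, y = 0
case_active   = lambda a: a["x"] >= 0.0 and a["y"] == a["x"]   # c2: x >= 0, y = x

def equivalent_on(assignments):
    # The disjunction c1 v c2 holds exactly when the ReLU constraint holds.
    return all(relu_constraint(a) == (case_inactive(a) or case_active(a))
               for a in assignments)

samples = [{"x": x, "y": y} for x in (-2.0, 0.0, 3.0) for y in (0.0, 3.0, -1.0)]
print(equivalent_on(samples))  # True
```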

In our implementation, piecewise-linear constraints are represented by objects of classes that inherit from the *PiecewiseLinearConstraint* abstract class. Currently the two supported instances are ReLU and Max, but the design is modular in the sense that new constraint types can easily be added. *PiecewiseLinearConstraint* defines the interface methods that each supported piecewise-linear constraint needs to implement. Some of the key interface methods are:


#### **2.3 Constraint- and Network-Level Reasoning (***RowBoundTightener***,** *ConstraintBoundTightener* **and** *SymbolicBoundTightener* **Classes)**

Effective deduction of tighter variable bounds is crucial for Marabou's performance. Deduction is performed at the constraint level, by repeatedly examining linear and piecewise-linear constraints to see if they imply tighter variable bounds; and also at the DNN-level, by leveraging the network's topology.

Constraint-level bound tightening is performed by querying the piecewiselinear constraints for tighter bounds using the *getEntailedTightenings()* method. Similarly, linear equations can also be used to deduce tighter bounds. For example, the equation x = y + z and lower bounds x ≥ 0, y ≥ 1 and z ≥ 1 together imply the tighter bound x ≥ 2. As part of the simplex-based search, Marabou repeatedly encounters many linear equations and uses them for bound tightening.
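The bound tightening in the x = y + z example above amounts to simple interval arithmetic; a minimal sketch (`tighten_sum` is a hypothetical helper, not Marabou's API):

```python
def tighten_sum(bounds, x, y, z):
    """Given the equation x = y + z, tighten the interval bounds of x.

    bounds maps each variable name to a (lo, hi) pair.
    """
    ylo, yhi = bounds[y]
    zlo, zhi = bounds[z]
    xlo, xhi = bounds[x]
    # x = y + z implies lo(y) + lo(z) <= x <= hi(y) + hi(z)
    bounds[x] = (max(xlo, ylo + zlo), min(xhi, yhi + zhi))
    return bounds

b = {"x": (0.0, 10.0), "y": (1.0, 4.0), "z": (1.0, 4.0)}
print(tighten_sum(b, "x", "y", "z")["x"])  # (2.0, 8.0)
```

With y ≥ 1 and z ≥ 1, the lower bound of x is tightened from 0 to 2, exactly as in the example in the text.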

Several recent papers have proposed verification schemes that rely on DNNlevel reasoning [5,23]. Marabou supports this kind of reasoning as well, by storing the initial network topology and performing deduction steps that use this information as part of its iterative search. DNN-level reasoning is seamlessly integrated into the search procedure by (1) initializing the DNN-level reasoners with the most up-to-date information discovered during the search, such as variable bounds and the state of piecewise-linear constraints; and (2) feeding any new information that is discovered back into the search procedure. Presently Marabou implements a symbolic bound tightening procedure [23]: based on network topology, upper and lower bounds for each hidden neuron are expressed as a linear combination of the input neurons. Then, if the bounds on the input neurons are sufficiently tight (e.g., as a result of past deductions), these expressions for upper and lower bounds may imply that some of the hidden neurons' piecewise-linear activation functions are now restricted to one of their linear segments. Implementing additional DNN-level reasoning operations is work in progress.
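The heart of symbolic bound tightening, concretizing a symbolic linear bound over the input box, can be sketched as follows (illustrative weights; `concretize` is a hypothetical helper, not Marabou's API):

```python
import numpy as np

def concretize(coeffs, const, in_lo, in_hi):
    """Evaluate the symbolic bound coeffs . x + const over the box [in_lo, in_hi].

    For the minimum, each positive coefficient takes the input's lower bound
    and each negative coefficient takes the upper bound; vice versa for the max.
    """
    lo = const + np.sum(np.where(coeffs >= 0, coeffs * in_lo, coeffs * in_hi))
    hi = const + np.sum(np.where(coeffs >= 0, coeffs * in_hi, coeffs * in_lo))
    return lo, hi

# Pre-activation of one hidden neuron: 2*x0 - 1*x1 + 0.5 (illustrative weights).
coeffs, const = np.array([2.0, -1.0]), 0.5
lo, hi = concretize(coeffs, const, np.array([0.0, 0.0]), np.array([1.0, 1.0]))
print(lo, hi)  # -0.5 2.5 -> neither ReLU phase is fixed

# With a tighter input box (e.g., after past deductions) the phase may become fixed:
lo2, _ = concretize(coeffs, const, np.array([0.5, 0.0]), np.array([1.0, 0.2]))
print(lo2 >= 0.0)  # True: the ReLU is provably in its active (linear) phase
```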

#### **2.4 The Engine (***Engine* **and** *SmtCore* **Classes)**

The main class of Marabou, in which the main loop resides, is called the *Engine*. The engine stores and coordinates the various solution components, including the simplex core and the piecewise-linear constraints. The main loop consists, roughly, of the following steps (the first rule that applies is used):


The engine also triggers deduction steps, both at the neuron level and at the network level, according to various heuristics.

#### **2.5 The Divide-and-Conquer Mode and Concurrency (***DnC.py***)**

Marabou supports a *divide-and-conquer* (*D*&*C* ) solving mode, in which the input region specified in the original query is partitioned into sub-regions. The desired property is checked on these sub-regions independently. The D&C mode naturally lends itself to parallel execution, by having each sub-query checked on a separate node. Moreover, the D&C mode can improve Marabou's overall performance even when running sequentially: the total time of solving the subqueries is often less than the time of solving the original query, as the smaller input regions allow for more effective deduction steps.

Given a query φ, the solver maintains a queue Q of ⟨query, timeout⟩ pairs. Q is initialized with the single element ⟨φ, T⟩, where T, the initial timeout, is a configurable parameter. To solve φ, the solver loops through the following steps:


The timeout factor m and the splitting factor k are configurable parameters. Splitting the query's input region is performed heuristically.
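The D&C loop described above can be sketched as follows (`solve_with_timeout` and `split` are assumed callbacks standing in for the underlying solver and the heuristic input-region partitioner, not part of Marabou's actual interface):

```python
from collections import deque

def dnc_solve(query, T, k, m, solve_with_timeout, split):
    """Divide-and-conquer loop, a sketch of the strategy described above.

    solve_with_timeout(q, t) -> "sat" | "unsat" | "timeout"
    split(q, k)              -> k sub-queries covering q's input region
    """
    queue = deque([(query, T)])
    while queue:
        q, t = queue.popleft()
        result = solve_with_timeout(q, t)
        if result == "sat":
            return "sat"                   # counter-example found
        if result == "timeout":
            for sub in split(q, k):        # partition the input region
                queue.append((sub, t * m)) # enlarged timeout for sub-queries
        # "unsat": sub-region discharged, nothing to enqueue
    return "unsat"                         # all sub-regions verified
```

Smaller sub-regions allow stronger bound deductions, which is why the total time over the sub-queries is often less than the time to solve the original query.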

#### **2.6 Input Interfaces (***AcasParser* **class,** *maraboupy* **Folder)**

Marabou supports verification queries provided through the following interfaces:


#### **3 Evaluation**

For our evaluation we used the ACAS Xu [12], CollisionDetection [4], and TwinStream [1] families of benchmarks. Tool-wise, we considered Reluplex, the tool most closely related to Marabou, and also ReluVal [23] and Planet [4]. The version of Marabou used for the evaluation is available online [14].

The top left plot in Fig. 3 compares the execution times of Marabou and Reluplex on 180 ACAS Xu benchmarks with a 1-hour timeout. We used Marabou in D&C mode with 4 cores and with T = 5, k = 4, and m = 1.5. The remaining three plots depict an execution-time comparison between Marabou D&C (configuration as above), ReluVal, and Planet, using 4 cores and a 1-hour timeout. Marabou and ReluVal are evaluated over 180 ACAS Xu benchmarks (top right plot), and Marabou and Planet are evaluated on those 180 benchmarks (bottom left plot) and also on 500 CollisionDetection and 81 TwinStream benchmarks (bottom right plot). Due to technical difficulties, ReluVal was not run on the CollisionDetection and TwinStream benchmarks. The results show that in a 4-core setting Marabou generally outperforms Planet, but generally does not outperform ReluVal (though it does better on some benchmarks). These results highlight the need for additional DNN-level reasoning in Marabou, which is a key ingredient in ReluVal's verification procedure.

Figure 2 shows the average runtime of Marabou and ReluVal on the ACAS Xu properties, as a function of the number of available cores. We see that as the number of cores increases, Marabou (solid) is able to close the gap with, and sometimes outperform, ReluVal (dotted). With 64 cores, Marabou outperforms ReluVal on average, and both solvers were able to solve all ACAS Xu benchmarks within 2 hours (except for a few segfaults by ReluVal).

**Fig. 2.** A scalability comparison of Marabou and ReluVal on ACAS Xu.

**Fig. 3.** A comparison of Marabou with Reluplex, ReluVal and Planet.

#### **4 Conclusion**

DNN analysis is an emerging field, and Marabou is a step towards a more mature, stable verification platform. Moving forward, we plan to improve Marabou in several dimensions. Part of our motivation in implementing a custom simplex solver was to obtain the flexibility needed for fusing together the solving processes for linear and non-linear constraints. Currently, this flexibility has not been leveraged much, as these pieces are solved relatively separately. We expect that by tackling both kinds of constraints simultaneously, we will be able to improve performance significantly. Other enhancements we wish to add include: additional network-level reasoning techniques based on abstract interpretation; better heuristics for both the linear and non-linear constraint solving engines; and additional engineering improvements, specifically within the simplex engine.

**Acknowledgements.** We thank Elazar Cohen, Justin Gottschlich, and Lindsey Kuper for their contributions to this project. The project was partially supported by grants from the Binational Science Foundation (2017662), the Defense Advanced Research Projects Agency (FA8750-18-C-0099), the Federal Aviation Administration, Ford Motor Company, Intel Corporation, the Israel Science Foundation (683/18), the National Science Foundation (1814369, DGE-1656518), Siemens Corporation, and the Stanford CURIS program.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Probabilistic Systems, Runtime Techniques

### **Probabilistic Bisimulation for Parameterized Systems (with Applications to Verifying Anonymous Protocols)**

Chih-Duo Hong<sup>1</sup>, Anthony W. Lin<sup>2</sup>, Rupak Majumdar<sup>3</sup>, and Philipp Rümmer<sup>4</sup>

<sup>1</sup> Oxford University, Oxford, UK, chihduo.hong@gmail.com
<sup>2</sup> TU Kaiserslautern, Kaiserslautern, Germany, lin@cs.uni-kl.de
<sup>3</sup> Max Planck Institute for Software Systems, Kaiserslautern, Germany, rupak@mpi-sws.org
<sup>4</sup> Uppsala University, Uppsala, Sweden, philipp.ruemmer@it.uu.se

**Abstract.** Probabilistic bisimulation is a fundamental notion of process equivalence for probabilistic systems. It has important applications, including the formalisation of the anonymity property of several communication protocols. While there is a large body of work on verifying probabilistic bisimulation for finite systems, the problem is in general undecidable for parameterized systems, i.e., for infinite families of finite systems with an arbitrary number n of processes. In this paper we provide a general framework for reasoning about probabilistic bisimulation for parameterized systems. Our approach is in the spirit of software verification, wherein we encode proof rules for probabilistic bisimulation and use a decidable first-order theory to specify systems and candidate bisimulation relations, which can then be checked automatically against the proof rules.

We work in the framework of regular model checking, and specify an infinite-state system as a regular relation described by a first-order formula over a universal automatic structure, i.e., a logical theory over the string domain. For probabilistic systems, we show how probability values (as well as the required operations) can be encoded naturally in the logic. Our main result is that one can specify the verification condition of whether a given regular binary relation is a probabilistic bisimulation as a regular relation. Since the first-order theory of the universal automatic structure is decidable, we obtain an effective method for verifying probabilistic bisimulation for infinite-state systems, given a regular relation as a candidate proof. As a case study, we show that our framework is sufficiently expressive for proving the anonymity property of the parameterized dining cryptographers protocol and the parameterized grades protocol. Neither of these protocols could hitherto be verified by existing automatic methods.

This research was sponsored in part by the ERC Starting Grant 759969 (AV-SMP), ERC Synergy project 610150 (ImPACT), the DFG project 389792660-TRR 248 (Perspicuous Computing), the Swedish Research Council (VR) under grant 2018-04727, and by the Swedish Foundation for Strategic Research (SSF) under the project WebSec (Ref. RIT17-0011).

Moreover, with the help of standard automata learning algorithms, we show that the candidate relations can be synthesized fully automatically, making the verification fully automated.

#### **1 Introduction**

Equivalence checking using bisimulation relations plays a fundamental role in formal verification. Bisimulation is the basis for substitutability of systems: if two systems are bisimilar, their behaviors are the same and they satisfy the same formulas in expressive temporal logics. The notion of bisimulation is defined both for deterministic [39] and for probabilistic transition systems [34]. In both contexts, checking bisimulation has many applications, such as proving correctness of anonymous communication protocols [15], reasoning about knowledge [22], program optimization [32], and optimizations for computational problems (e.g. language equivalence and minimization) of finite automata [12].

The problem of checking bisimilarity of two given systems has been widely studied. It is decidable in polynomial time for both probabilistic and non-probabilistic *finite-state* systems [6,17,20,52]. These algorithms form the basis of practical tools for checking bisimulation. For infinite-state systems, such as parameterized versions of communication protocols (i.e., infinite families of finite-state systems with an arbitrary number n of processes), the problem is undecidable in general. Most research hitherto has focused on identifying decidable subcases (e.g., strong bisimulations for pushdown systems in the probabilistic and non-probabilistic cases [25,47,48]), rather than on providing tool support for practical problems.

In this paper, we propose a first-order verification approach—inspired by software verification techniques—for reasoning about bisimilarity for infinite-state systems. In our approach, we provide first-order logic *proof rules* to determine if a given binary relation is a bisimulation. To this end, we must find an *encoding* of systems and relations and a *decidable first-order theory* that can formalize the system, the property, and the proof rules. We propose to use the decidable first-order theory of the *universal automatic structure* [8,10]. Informally, the domain of the theory is a set of words over a finite alphabet Σ, and it captures the first-order theory of the infinite |Σ|-ary tree with a relation that relates strings of the same level. The theory can express precisely the class of all *regular relations* [8] (a.k.a. automatic relations [10]), which are relations ϕ(x1,...,xk) over strings Σ<sup>∗</sup> that can be recognized by synchronous multi-tape automata. It is also sufficiently powerful to capture many classes of non-probabilistic infinite-state systems and regular model checking [3,13,49–51].

We demonstrate the effectiveness of the approach by encoding and automatically verifying some challenging examples from the literature of parameterized systems in our logic: the anonymity property of the parameterized dining cryptographers protocol [16] and the grades protocol [29]. These examples were only automatically verified for some fixed parameters using finite-state model checkers or equivalence checkers (e.g. see [28,29]). Just as invariant verification for software separates out the proof rules (verification conditions in a decidable logic) from the synthesis of invariants, we separate out proof rules for bisimulation from the synthesis of bisimulation relations. We demonstrate how recent developments in generating and refining candidate proofs as automata (e.g. [18,26,27,37,38,40,41,53]) can be used to automate the search of proofs, making our verification fully "push button."

**Contributions.** Our contributions are as follows. First, we show how probabilistic infinite-state systems can be faithfully encoded in the first-order theory of the universal automatic structure. In the past, the theory has been used to reason about qualitative liveness of weakly-finite MDPs (e.g. see [36,37]), which allowed the authors to disregard the actual non-zero probability values. To the best of our knowledge, no encoding of probabilistic transition systems in the theory was available. To encode probabilistic systems effectively, the theory would naturally need to be two-sorted: one sort for encoding the configurations, and the other for encoding the probability values. We show how both sorts (and the operations required for the sorts) can be encoded in the universal automatic structure, which offers only the domain of strings. In the sequel, such transition systems will be called *regular transition systems*.

Second, using the *minimal probability assumption* on the transition systems [34] (i.e., there exists an ε > 0 such that any non-zero transition probability is at least ε), which is often satisfied in practice, we show how the verification condition of whether a given regular binary relation is a probabilistic bisimulation can be encoded in the theory. The decidability of the first-order theory over the universal automatic structure gives us an effective means of checking probabilistic bisimulation for regular transition systems. In fact, the theory can easily be reduced to the weak monadic theory WS1S of one successor (therefore allowing highly optimized tools like Mona [31] and Gaston [23]) by interpreting finite words as finite sets (e.g. see [19,46]).

Our framework requires the encoding of the systems and the proofs in the first-order theory of the universal automatic structure. Which interesting examples can it capture? Our third contribution is to provide two examples from the literature of parameterized verification: the anonymity property of the parameterized dining cryptographers protocol [16] and of the parameterized grades protocol [29]. We study two versions of dining cryptographers protocol in this paper: the classical version where the secrets are single bits, and a generalized version where the secrets are bit-vectors of arbitrary length.

Thus far, our framework requires a candidate proof to be supplied by the user. Our final contribution is to demonstrate how standard techniques from the synthesis literature (e.g. automata learning [18,26,27,37,38,40,41,53]) can be used to fully automate the proof search. Using automata learning, we successfully pinpoint regular proofs for the anonymity property of the three protocols: the two dining cryptographers protocols are verified in 6 and 28 s, respectively, and the grades protocol in 35 s.

**Other Related Work.** The verification framework we use in this paper can be construed as a regular model checking [3] framework using regular relations. The framework uses first-order logic as the language, which makes it convenient to express many verification conditions (as is well-known from first-order theorem proving [14]). The use of the universal automatic structure allows us to express two different sorts (configurations and probability values) in one sort (i.e. strings). Most work in regular model checking focuses on safety and liveness properties (e.g. [2,3,11,13,27,36,37,40,42,49,51,53]).

Some automated techniques can prove the anonymity property of the dining cryptographers protocol and the grades protocol in the finite case, e.g., the PRISM model checker [28,45] and language equivalence by the tool APEX [29]. To the best of our knowledge, our method is the first automated technique proving the anonymity property of the protocols in the parameterized case.

Our work is in the spirit of deductive software verification (e.g., [4,14,24,35,43,44]), where one provides inductive invariants manually, and a tool automatically checks correctness of the candidate invariants. In theory, our result yields a fully automatic procedure by enumerating all candidate regular proofs, and at the same time enumerating all candidate counterexamples (note that we avoid undecidability by restricting attention to proofs encodable as regular relations). In our implementation, we use recent advances in automata-learning based synthesis to efficiently encode the search [18,37].

### **2 Preliminaries**

**General Notation.** We use N to denote the non-negative integers. Given a, b ∈ R, we use the standard notation [a, b] := {c ∈ R : a ≤ c ≤ b} for real intervals. Given a set S, we use S<sup>∗</sup> to denote the set of all finite sequences of elements from S. The set S<sup>∗</sup> always includes the empty sequence, which we denote by ε. We call a function f : S → [0, 1] a *probability distribution over* S if Σ<sub>s∈S</sub> f(s) = 1. We shall use I<sub>s</sub> to denote the probability distribution f with f(s) = 1, and D<sub>S</sub> to denote the set of probability distributions over S. Given a function f : X<sub>1</sub> × ··· × X<sub>n</sub> → Y, the *graph* of f is the relation {(x<sub>1</sub>, ..., x<sub>n</sub>, f(x<sub>1</sub>, ..., x<sub>n</sub>)) : ∀i ∈ {1,...,n}. x<sub>i</sub> ∈ X<sub>i</sub>}. Whenever a relation R is an equivalence relation over a set S, we use S/R to denote the set of equivalence classes induced by R. Depending on the context, we may write pRq or R(p, q) for (p, q) ∈ R.

**Words and Automata.** We assume basic familiarity with word automata. Fix a finite alphabet Σ. For each finite word w := w<sub>1</sub> ... w<sub>n</sub> ∈ Σ<sup>∗</sup>, we write w[i, j], where 1 ≤ i ≤ j ≤ n, to denote the segment w<sub>i</sub> ... w<sub>j</sub>. Given an automaton A := (Σ, Q, δ, q<sub>0</sub>, F), a run of A on w is a function ρ : {0,...,n} → Q with ρ(0) = q<sub>0</sub> that obeys the transition relation δ. We may also denote the run ρ by the word ρ(0) ··· ρ(n) over the alphabet Q. The run ρ is said to be *accepting* if ρ(n) ∈ F, in which case we say that the word w is *accepted* by A. The language L(A) of A is the set of words in Σ<sup>∗</sup> accepted by A.

**Transition Systems.** We fix a set ACT of *action symbols*. A *transition system* over ACT is a tuple S := ⟨S; {→<sub>a</sub>}<sub>a∈ACT</sub>⟩, where S is a set of *configurations* and →<sub>a</sub> ⊆ S × S is a binary relation over S. We use → to denote the relation ⋃<sub>a∈ACT</sub> →<sub>a</sub>. We say that a sequence s<sub>1</sub> → ··· → s<sub>n+1</sub> is a *path* in S if s<sub>1</sub>, ..., s<sub>n+1</sub> ∈ S and s<sub>i</sub> → s<sub>i+1</sub> for each i ∈ {1,...,n}. A transition system is called *bounded branching* if the number of configurations reachable from a configuration in one step is bounded. Formally, this means that there exists an a priori fixed integer N such that for all s ∈ S, |{s′ ∈ S : s → s′}| ≤ N.

**Probabilistic Transition Systems.** A *probabilistic transition system* (*PTS*) [34] is a structure S := ⟨S; {δ<sub>a</sub>}<sub>a∈ACT</sub>⟩, where S is a set of configurations and δ<sub>a</sub> : S → D<sub>S</sub> ∪ {0} maps each configuration to either a probability distribution or the zero function 0. Here δ<sub>a</sub>(s) = 0 simply means that s is a "dead end" for action a. We shall write δ<sub>a</sub>(s, s′) for δ<sub>a</sub>(s)(s′). In this paper, we always assume that δ<sub>a</sub>(s, s′) is a rational number and that |{s′ : δ<sub>a</sub>(s, s′) ≠ 0}| < ∞. The *underlying transition graph* of a PTS is the transition system ⟨S; {→<sub>a</sub>}<sub>a∈ACT</sub>⟩ such that s →<sub>a</sub> s′ iff δ<sub>a</sub>(s, s′) ≠ 0.

It is standard (e.g. see [34]) to impose the *minimal probability assumption* on the PTSs that we shall be dealing with, i.e., there is an ε > 0 such that any transition with a non-zero probability p satisfies p ≥ ε. This assumption is practically sensible since it is satisfied by most PTSs that we deal with in practice (e.g. finite PTSs, probabilistic pushdown automata [21], and most examples from probabilistic parameterized systems [36,37], including our examples from Sect. 5). The minimal probability assumption, among others, implies that the PTS is bounded-branching (i.e., that its underlying transition graph is bounded-branching). In the sequel, we shall adopt this assumption.

**Probabilistic Bisimulations.** Let S := ⟨S; {δ<sub>a</sub>}<sub>a∈ACT</sub>⟩ be a PTS. For S′ ⊆ S, we write s →<sub>a</sub><sup>ρ</sup> S′ if Σ<sub>s′∈S′</sub> δ<sub>a</sub>(s, s′) = ρ. A *probabilistic bisimulation* for S is an equivalence relation R over S such that (p, q) ∈ R implies

$$\forall a \in \mathsf{ACT}.\; \forall S' \in S/R. \left(p \xrightarrow{\rho}_{a} S' \Leftrightarrow q \xrightarrow{\rho}_{a} S'\right). \tag{1}$$

We say that p and q are *probabilistic bisimilar* (written p ∼ q) if there is a probabilistic bisimulation R such that (p, q) ∈ R. We can compute a probabilistic bisimulation between two PTSs S := ⟨S; {δ<sub>a</sub>}<sub>a∈ACT</sub>⟩ and S′ := ⟨S′; {δ′<sub>a</sub>}<sub>a∈ACT</sub>⟩ by computing a probabilistic bisimulation R for the disjoint union of S and S′, which is defined as S ⊎ S′ := ⟨S ⊎ S′; {δ″<sub>a</sub>}<sub>a∈ACT</sub>⟩, where δ″<sub>a</sub>(s) := δ<sub>a</sub>(s) for s ∈ S, and δ″<sub>a</sub>(s) := δ′<sub>a</sub>(s) for s ∈ S′. In this case, we say R is a probabilistic bisimulation between S and S′.

#### **3 Framework of Regular Relations**

In this section we describe the framework of regular relations for specifying probabilistic infinite-state systems, properties to verify, and proofs, all in a uniform symbolic way. The framework is amenable to automata-theoretic algorithms in the spirit of *regular model checking* [3,13].

The framework of *regular relations* [8] (a.k.a. *automatic relations* [9]) uses the first-order theory of the universal<sup>1</sup> automatic structure

$$\mathfrak{U} := \langle \Sigma^\*; \preceq, \mathsf{eqL}, \{l\_a\}\_{a \in \Sigma} \rangle,\tag{2}$$

where Σ is some finite alphabet, ⪯ is the (non-strict) prefix-of relation, eqL is the binary equal-length predicate, and l<sub>a</sub> is a unary predicate asserting that the last letter of the word is a. The domain of the structure is the set of finite words over Σ, and for words w, w′ ∈ Σ<sup>∗</sup>, we have w ⪯ w′ iff there is some w″ ∈ Σ<sup>∗</sup> such that w · w″ = w′; eqL(w, w′) iff |w| = |w′|; and l<sub>a</sub>(w) iff there is some w″ ∈ Σ<sup>∗</sup> such that w = w″ · a.

Next, we discuss the expressive power of first-order formulas over the universal automatic structures, and decision procedures for satisfiability of such formulas. In Sect. 4, we shall describe: (1) how to specify a PTS as a first-order formula in U , and (2) how to specify the verification condition for probabilistic bisimulation property in this theory. In Sect. 5, we shall show that the theory is sufficiently powerful for capturing probabilistic bisimulations for interesting examples.

<sup>1</sup> Here, "universal" simply means that all automatic structures are first-order interpretable in this structure.

*Expressiveness and Decidability.* The name "regular" is associated with this framework because the set of formulas ϕ(x<sub>1</sub>,...,x<sub>k</sub>) first-order definable in U coincides with the *regular relations*, i.e., relations definable by synchronous automata. More precisely, we define [[ϕ]] as the relation containing all tuples (w<sub>1</sub>,...,w<sub>k</sub>) ∈ (Σ<sup>∗</sup>)<sup>k</sup> such that U |= ϕ(w<sub>1</sub>,...,w<sub>k</sub>). In addition, we define the *convolution* w<sub>1</sub> ⊗ ··· ⊗ w<sub>k</sub> of words w<sub>1</sub>,...,w<sub>k</sub> ∈ Σ<sup>∗</sup> as a word w over Σ<sub>⊥</sub><sup>k</sup> (where ⊥ ∉ Σ) such that w[i] = (a<sub>1</sub>,...,a<sub>k</sub>) with

$$a_j = \begin{cases} w_j[i] & \text{if } |w_j| \ge i, \\ \bot & \text{otherwise.} \end{cases}$$

In other words, $w$ is obtained by juxtaposing $w_1,\ldots,w_k$ and padding the shorter words with $\bot$. For example, $010 \otimes 00 = (0,0)(1,0)(0,\bot)$. A $k$-ary relation $R$ over $\Sigma^*$ is *regular* if the set $\{w_1 \otimes \cdots \otimes w_k : (w_1,\ldots,w_k) \in R\}$ is a regular language over the alphabet $\Sigma_\bot^k$. The relationship between U and regular relations can be formally stated as follows.
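The convolution and padding scheme is straightforward to compute; below is a minimal Python sketch (the function name and the textual padding constant are our choices):

```python
BOT = "⊥"  # padding symbol, written ⊥ in the text

def convolution(*words):
    """Convolution w1 ⊗ ... ⊗ wk: read the words letter by letter in
    lockstep, padding the shorter ones with the symbol ⊥."""
    n = max(len(w) for w in words)
    return [tuple(w[i] if i < len(w) else BOT for w in words)
            for i in range(n)]
```

For instance, `convolution("010", "00")` reproduces the example above, yielding the letters (0, 0), (1, 0), (0, ⊥).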

#### **Proposition 1 (**[8–10]**).**

*A relation is first-order definable in U if and only if it is regular; moreover, the translations between formulas over U and synchronous automata are effective.*

The decidability of the first-order theory of U follows using a standard automata-theoretic algorithm (e.g., see [9,49]).

In the sequel, we shall also use the term *regular relations* to denote relations definable in U. In addition, to avoid notational clutter, we shall freely use other regular relations (e.g., the successor relation $\prec_{\mathit{succ}}$ of the prefix order $\preceq$, and membership in a regular language) as syntactic sugar.

We note that the first-order theory of U can also be reduced to WS1S, the weak monadic second-order theory of one successor (thereby allowing the use of highly optimized tools like MONA [31] and Gaston [23]), by translating finite words to finite sets. The relationship between the universal automatic structure and WS1S can be made precise using the notion of *finite-set interpretations* [19,46].

#### **4 Probabilistic Bisimilarity Within Regular Relations**

In this section, we show how the framework of regular relations can be used to encode a PTS, and the corresponding proof rules for probabilistic bisimulation.

#### **4.1 Specifying a Probabilistic Transition System**

Since we assume that all probability values specified in our systems are rational numbers, the fact that our PTS is bounded-branching implies that we can specify the probability values by natural *weights* (by multiplying the probability values by the least common multiple of the denominators). For example, if a configuration c has an action toss that takes it to c<sup>1</sup> and c2, each with probability 1/2, then the new system simply changes both values of 1/2 to 1. This is a known trick in the literature of probabilistic verification (e.g. see [1]). Therefore, we can now assume that the transition probability functions have range N. *The challenge now is that our encoding of a PTS in the universal automatic structure must encode two different sorts* as words over a finite alphabet Σ: configurations and natural weights.
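The normalization step is easy to mechanize; a small Python sketch (the function name is ours), taking exact rationals as input to avoid floating-point rounding:

```python
from fractions import Fraction
from math import lcm

def to_natural_weights(probs):
    """Multiply every probability by the least common multiple of the
    denominators, turning rational probabilities into natural weights."""
    fracs = [Fraction(p) for p in probs]
    m = lcm(*(f.denominator for f in fracs))
    return [int(f * m) for f in fracs]
```

For example, the branch probabilities 1/2 and 1/2 become the weights 1 and 1, and the probabilities 3/10, 1/5, 1/2 from Example 2 below become 3, 2, 5.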

Now we are ready to show how to specify a PTS S in our framework. Fix a finite alphabet Σ containing at least the two letters 0 and 1. We encode the domain of S as words over Σ. In addition, a natural weight $n \in \mathbb{N}$ can be encoded in the usual way as a binary string. This motivates the following definition.

**Definition 1.** *Let* S *be a PTS* $\langle S; \{\delta_a\}_{a \in \mathsf{ACT}}\rangle$*. We say that* S *is regular if the domain* S *is a regular subset of* $\Sigma^*$ *(i.e., definable by a first-order formula* ϕ(x) *with one free variable over* U*), and if the graph of each function* $\delta_a$ *is a ternary regular relation (i.e., definable by a first-order formula* ϕ(x, y, z) *over* U*, where* x *and* y *encode configurations, and* z *encodes a natural weight).*

Definition 1 is quite general since it allows for an infinite number of different natural weights in the PTS. Note that we can make do without the second sort (of numeric weights) if there are only finitely many numeric weights $n_1,\ldots,n_m$. This can be achieved by specifying a regular relation $R_{a,i}$ for each action label a ∈ ACT and numeric weight $n_i$ with i ∈ {1,...,m}.

*Example 1.* We show a regular encoding of a very simple PTS: a random walk on the set of natural numbers. At each position x, the system can non-deterministically choose to loop or to move. If the system chooses to loop, it will stay at the same position with probability 1. If the system chooses to move, it will move to x + 1 with probability 1/4, or move to max(0, x − 1) with probability 3/4. Normalising the probability values by multiplying by 4, we obtain the numeric weights of 4, 1, and 3 for the aforementioned transitions, respectively.

To represent the system by regular relations, we encode the positions in unary and the numeric weights in binary. The set of configurations is the regular language 1∗. The graph of the transition probability function can be described by a first-order formula ϕ(x, y, z) := ϕloop(x, y, z) ∨ ϕmove(x, y, z) over U , where

$$\begin{aligned} \varphi_{\mathit{loop}}(x, y, z) :={}& x \in 1^* \wedge y \in 1^* \wedge ((x = y \wedge z = 100) \vee (x \neq y \wedge z = 0)); \\ \varphi_{\mathit{move}}(x, y, z) :={}& x \in 1^* \wedge y \in 1^* \wedge ((x \prec_{\mathit{succ}} y \wedge z = 1) \vee (y \prec_{\mathit{succ}} x \wedge z = 11) \\ & \vee (x = \epsilon \wedge y = \epsilon \wedge z = 11) \\ & \vee (\neg(x \prec_{\mathit{succ}} y) \wedge \neg(y \prec_{\mathit{succ}} x) \wedge \neg(x = \epsilon \wedge y = \epsilon) \wedge z = 0)). \end{aligned}$$
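The case split made by ϕmove can be tabulated directly; a Python sketch (the function name is ours) with positions in unary and the weights as plain integers:

```python
def move_weight(x, y):
    """Weight of the move action from position x to position y, both
    encoded in unary; mirrors the case split of phi_move (the weights
    are the probabilities 1/4 and 3/4 multiplied by 4)."""
    assert set(x) <= {"1"} and set(y) <= {"1"}
    if len(y) == len(x) + 1:                  # x -> x + 1
        return 1
    if len(x) == len(y) + 1 or x == y == "":  # x -> max(0, x - 1)
        return 3
    return 0
```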

*Example 2.* As a second example, consider a PTS (from [25], Example 1) described by a probabilistic pushdown automaton with states Q = {p, q, r} and stack symbols Γ = {X, X′, Y, Z}. There is a unique action a, and the transition rules $\delta_a$ are as follows:

$$\begin{array}{llll}pX \stackrel{0.5}{\longrightarrow} qXX & pX \stackrel{0.5}{\longrightarrow} p & qX \stackrel{1}{\longrightarrow} pXX & rY \stackrel{1}{\longrightarrow} rXX\\rX \stackrel{0.3}{\longrightarrow} rYX & rX \stackrel{0.2}{\longrightarrow} rYX' & rX \stackrel{0.5}{\longrightarrow} r\\rX' \stackrel{0.4}{\longrightarrow} rYX & rX' \stackrel{0.1}{\longrightarrow} rYX' & rX' \stackrel{0.5}{\longrightarrow} r\\\end{array}$$

A configuration of the PTS is a word in $Q\Gamma^*$, consisting of a state in Q followed by a word over the stack symbols. A transition rule can be applied if its left-hand side matches a prefix of the configuration. We encode the PTS as follows: the set of configurations is $Q\Gamma^*$, the weights are represented in binary after normalization (multiplying each probability by 10, the least common multiple of the denominators), and the transition relation ϕ(x, y, z) encodes the transition rules as a disjunction. For example, the disjunct corresponding to the rule $pX \xrightarrow{0.5} qXX$ is

$$x \in Q\Gamma^* \land y \in Q\Gamma^* \land (\exists u.\ x = pXu \land y = qXXu) \land z = 101,$$

where 101 is the binary encoding of the normalized weight $5 = 0.5 \times 10$.

Note that the PTS is bounded branching with a bound 3.
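The prefix-rewriting semantics of this example can be mirrored concretely. In the sketch below (our encoding), configurations are tuples of symbols, so that X and X′ cannot be confused, and the weights are the probabilities multiplied by 10:

```python
# Transition rules of Example 2; weights are the probabilities times 10.
RULES = [
    (("p", "X"), ("q", "X", "X"), 5),   (("p", "X"), ("p",), 5),
    (("q", "X"), ("p", "X", "X"), 10),  (("r", "Y"), ("r", "X", "X"), 10),
    (("r", "X"), ("r", "Y", "X"), 3),   (("r", "X"), ("r", "Y", "X'"), 2),
    (("r", "X"), ("r",), 5),            (("r", "X'"), ("r", "Y", "X"), 4),
    (("r", "X'"), ("r", "Y", "X'"), 1), (("r", "X'"), ("r",), 5),
]

def successors(config):
    """Apply every rule whose left-hand side is a prefix of the
    configuration; return a map from successor configuration to weight."""
    out = {}
    for lhs, rhs, w in RULES:
        if config[:len(lhs)] == lhs:
            out[rhs + config[len(lhs):]] = w
    return out
```

For instance, `successors(("p", "X", "Z"))` yields the two successors qXXZ and pZ, each with weight 5.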

#### **4.2 Proof Rules for Probabilistic Bisimulation**

Fix the set ACT of action symbols and the branching bound N ≥ 1, which exists owing to the minimal probability assumption. Consider a two-sorted vocabulary $\sigma = \langle \{P_a\}_{a \in \mathsf{ACT}}, R, +\rangle$, where each $P_a$ is a ternary relation (with the first two arguments over the first sort, and the third argument over the second sort of natural numbers), R is a binary relation over the first sort, and + is the addition function over the second sort of natural numbers. The main result we shall show next is summarized in the following theorem:

**Theorem 1.** *There is a fixed first-order formula* Φ *over* σ *such that a binary relation* R *is a probabilistic bisimulation over a bounded-branching PTS* $S = \langle S; \{\delta_a\}_{a \in \mathsf{ACT}}\rangle$ *iff* (S, R) |= Φ*. Furthermore, when* S *is a regular PTS and* R *is a regular relation, we can compute in polynomial time a first-order formula* Φ′ *over* U *such that* R *is a probabilistic bisimulation over* S *iff* U |= Φ′*.*

This theorem implies the following result:

**Theorem 2.** *Given a regular relation* $E \subseteq \Sigma^* \times \Sigma^*$ *and a bounded-branching regular PTS* $S = \langle S; \{\delta_a\}_{a \in \mathsf{ACT}}\rangle$*, there exists an algorithm that either finds* (u, v) ∈ E *which are not probabilistically bisimilar, or finds a regular probabilistic bisimulation relation* R *over* S *with* E ⊆ R *if one exists. The algorithm does not terminate iff* E *is contained in some probabilistic bisimulation relation, but no probabilistic bisimulation containing* E *is regular.*

Note that when verifying parameterized systems we are typically interested in checking bisimilarity over *a set of pairs* of configurations (instead of just one pair), which is why the statement above is phrased in terms of a relation E.

*Proof of Theorem 2.* To prove this, we provide two semi-algorithms, one for checking the existence of R and the other for showing that a pair (v, w) ∈ E is a witness for non-bisimilarity.

By Theorem 1, we can enumerate all candidate regular relations R and effectively check whether R is a probabilistic bisimulation over S. The condition E ⊆ R is a first-order property, and so can also be checked effectively.

To show that non-bisimilarity is recursively enumerable, observe that if we fix (v, w) ∈ E and a number d, then the restrictions $S_v$ and $S_w$ to configurations at distance at most d from v and w (respectively) are finite PTSs. Therefore, we can devise a semi-algorithm which enumerates all (v, w) ∈ E and all probabilistic modal logic (PML) formulas [34] F over ACT containing only rational numbers (i.e., formulas built from subformulas of the form $\langle a \rangle_\mu F'$, where $\mu \in [0, 1]$ is a rational number; this suffices because we assume only rational probabilities in the PTS). We then check that $S_v, v \models F$ but $S_w, w \not\models F$. Model checking PML formulas over finite systems is decidable (in fact, the logic is subsumed by Probabilistic CTL [7]), which makes our check effective.

#### **4.3 Proof of Theorem 1**

In the rest of the section, we shall give a proof of Theorem 1. Given a binary relation R ⊆ S × S, we can write a first-order formula $F_{\mathit{eq}}(R)$ checking that R is an equivalence relation:

$$\forall s, t, u \in S.\; R(s, s) \land (R(s, t) \Rightarrow R(t, s)) \land ((R(s, t) \land R(t, u)) \Rightarrow R(s, u)).$$

We shall next define a formula ϕa(p, q) for each a ∈ ACT, such that R is a probabilistic bisimulation for S = S; {δa}<sup>a</sup>∈ACT iff (S, R) |= Φ(R), where

$$\Phi(R) := F_{\mathit{eq}}(R) \land \forall p, q \in S.\; R(p, q) \Rightarrow \bigwedge_{a \in \mathsf{ACT}} \big((\psi_a(p) \land \psi_a(q)) \lor \varphi_a(p, q)\big). \tag{3}$$

The formula $\psi_a(s) := \forall s' \in S.\ \delta_a(s, s') = 0$ states that configuration s cannot move to any configuration through action a.

Before we describe $\varphi_a(p, q)$, we provide some intuition and define some intermediate macros. Fix configurations p and q. Informally, $\varphi_a(p, q)$ will first guess a set of configurations $u_1,\ldots,u_N$ containing the successors of p on action a, and a set of configurations $v_1,\ldots,v_N$ containing the successors of q on action a. Second, it will guess labellings $\alpha_1,\ldots,\alpha_N$ and $\beta_1,\ldots,\beta_N$ which correspond to partitionings of the configurations $u_1,\ldots,u_N$ and $v_1,\ldots,v_N$, respectively. The intuition is that the α's and β's "name" the partitions: if $\alpha_i = \alpha_j$ (resp. $\beta_i = \beta_j$), then $u_i$ and $u_j$ (resp. $v_i$ and $v_j$) are guessed to be in the same partition. The formula then checks that the guessed partitioning is compatible with the equivalence relation R (i.e., if the labelling claims $u_i$ and $u_j$ are in the same partition, then indeed $R(u_i, u_j)$ holds), and that the probability masses of the partitions assigned by configurations p and q satisfy the constraint given in (1).

For the first part, we define a formula

$$\begin{aligned} \mathsf{succ}_a(w; u_1, \dots, u_N) :={} & \Big(\bigwedge_{1 \le i < j \le N} u_i \neq u_j\Big) \wedge {}\\ & \Big(\forall u \in S.\ \delta_a(w, u) \neq 0 \Rightarrow \bigvee_{1 \le i \le N} u = u_i\Big), \end{aligned}$$

stating that the successors of configuration w on action a are among the N distinct configurations $u_1,\ldots,u_N$. Note that a configuration may have fewer than N successors. In this case, we can set the rest of the variables to arbitrary distinct configurations.

For the second part, we shall check that R is compatible with the guessed partitions, and that configurations p and q assign the same probability mass to the same partition. Let $k_1,\ldots,k_n$ be a labelling for configurations $s_1,\ldots,s_n$. To check that the partitioning induced by the labelling is compatible with R, we need to express the condition that $k_i = k_j$ if and only if $R(s_i, s_j)$ holds. To this end, we define a formula

$$\mathsf{compat}_R(s_1, \ldots, s_n; k_1, \ldots, k_n) := \bigwedge_{1 \le i < j \le n} (R(s_i, s_j) \Leftrightarrow k_i = k_j).$$

Now, we are ready to define ϕa(p, q):

$$\begin{aligned} \varphi_a(p, q) := \exists u_1, \dots, u_N&, v_1, \dots, v_N \in S.\; \exists \alpha_1, \dots, \alpha_N, \beta_1, \dots, \beta_N \in \mathbb{N}. \\ & \mathsf{succ}_a(p; u_1, \dots, u_N) \land \mathsf{succ}_a(q; v_1, \dots, v_N) \land {}\\ & \mathsf{compat}_R(u_1, \dots, u_N, v_1, \dots, v_N; \alpha_1, \dots, \alpha_N, \beta_1, \dots, \beta_N) \land {}\\ & \forall k \in \mathbb{N}. \left(\sum_{i:\, \alpha_i = k} \delta_a(p, u_i) = \sum_{i:\, \beta_i = k} \delta_a(q, v_i)\right). \end{aligned} \tag{4}$$

With this definition, $\varphi_a(p, q)$ holds if and only if $p \xrightarrow{\rho}_a S'$ ⇔ $q \xrightarrow{\rho}_a S'$ holds for every ρ ≥ 0 and every equivalence class $S' \in S/R$.

*Example 3.* Consider the PTS from Example 2. The configurations pXZ and rX are probabilistically bisimilar. This can be seen using a probabilistic bisimulation relation with equivalence classes $\{pX^kZ\} \cup \{rw : w \in \{X, X'\}^k\}$ for all k ≥ 0 and $\{qX^{k+1}Z\} \cup \{rYw : w \in \{X, X'\}^k\}$ for all k ≥ 1. The probabilistic bisimulation relation is definable as the symmetric closure of a regular relation R, where $(w_1, w_2) \in R$ iff

$$\begin{aligned} &(w_1 = w_2) \lor \\ &(w_1 \in pX^*Z \land w_2 \in r(X + X')^* \bot \land |w_1| = |w_2|) \lor \\ &(w_1 \in r(X + X')^* \land w_2 \in r(X + X')^* \land |w_1| = |w_2|) \lor \\ &(w_1 \in qX^*Z \land w_2 \in rY(X + X')^* \bot \land |w_1| = |w_2|) \lor \\ &(w_1 \in rY(X + X')^* \land w_2 \in rY(X + X')^* \land |w_1| = |w_2|). \end{aligned}$$

For this example, the formula (3) simplifies to $F_{\mathit{eq}}(R) \land \forall p, q \in S.\ R(p, q) \Rightarrow \varphi_a(p, q)$ for the unique action a. This formula checks the bisimulation condition for all states symbolically. To see the formula in action, fix the probabilistically bisimilar configurations pXZ and rX. In the PTS, pXZ has two successors, qXXZ and pZ, each with probability 0.5, and rX has three successors, rYX with probability 0.3, rYX′ with probability 0.2, and r with probability 0.5. In the formula for $\varphi_a(p, q)$, we can set the successors $u_i$ of pXZ and the successors $v_j$ of rX as above (the third "successor" $u_3$ is set to an arbitrary configuration not reachable from pXZ), and set $\alpha_1 = 1$, $\alpha_2 = 2$, $\beta_1 = \beta_2 = 1$, and $\beta_3 = 2$, corresponding to the equivalence classes of the bisimulation relation. One can check that the probability masses assigned to these classes are the same.
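The final check of this example, that both sides assign equal mass to every named partition, is a small computation; a Python sketch (helper names are ours), using the weights of Example 2 multiplied by 10:

```python
def masses_match(p_succ, q_succ, alpha, beta):
    """Last conjunct of phi_a: for every partition name k, the successors
    of p labelled k and the successors of q labelled k carry the same
    total weight.  p_succ/q_succ are lists of (successor, weight) pairs."""
    def mass(succ, labels, k):
        return sum(w for (_, w), l in zip(succ, labels) if l == k)
    return all(mass(p_succ, alpha, k) == mass(q_succ, beta, k)
               for k in set(alpha) | set(beta))
```

With `p_succ = [("qXXZ", 5), ("pZ", 5)]`, `q_succ = [("rYX", 3), ("rYX'", 2), ("r", 5)]` and the labellings α = (1, 2), β = (1, 1, 2) from above, the check succeeds.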

We remark that the first-order theory of U is sufficient to encode any probabilistic pushdown automaton, not just this example.

We proceed to show that if R and δ<sup>a</sup> are first-order definable over U then so are ψ<sup>a</sup> and ϕa. Suppose that δ<sup>a</sup> is encoded using the ternary relation δa(x, y, z), as stated in the previous section. (We shall re-use the symbol δ here to avoid a clash of names).

We define $\psi_a(s) := \forall s' \in S.\ \forall z \in \mathbb{N}.\ \delta_a(s, s', z) \Leftrightarrow z = 0$. To define $\varphi_a$, the key point is to express the sum of transition probabilities in the logic. We use the fact that addition of integers in binary encoding is regular (see e.g. [9]), and write a formula that performs iterated addition. Formally, for each a ∈ ACT we define a formula $\chi_a$ such that

$$\begin{aligned} &\chi_a(u; u_1, \dots, u_N; \alpha_1, \dots, \alpha_N; k; z) := \\ &\quad \exists z_1, \dots, z_{N+1} \in \mathbb{N}.\ z_1 = 0 \land z_{N+1} = z \land \bigwedge_{1 \le i \le N} \chi'_a(u, u_i, \alpha_i, k, z_i, z_{i+1}), \end{aligned}$$

where

$$\chi'_a(u, u', \kappa, k, x, y) := (\kappa = k \land \exists z.\ \delta_a(u, u', z) \land y = x + z) \lor (\kappa \neq k \land y = x)$$

performs a single addition (we use the fact that addition "y = x + z" in binary is encodable as a regular relation), and $z_1, \ldots, z_{N+1}$ store the intermediate sums. Hence, given $k \in \mathbb{N}$, $u_1, \ldots, u_N, v_1, \ldots, v_N \in S$, and $\alpha_1, \ldots, \alpha_N, \beta_1, \ldots, \beta_N \in \mathbb{N}$,

$$\sum_{i:\ \alpha_i = k} \delta_a(p, u_i) = \sum_{i:\ \beta_i = k} \delta_a(q, v_i)$$

if and only if

$$\exists z \in \mathbb{N}.\ \chi_a(p; u_1, \dots, u_N; \alpha_1, \dots, \alpha_N; k; z) \land \chi_a(q; v_1, \dots, v_N; \beta_1, \dots, \beta_N; k; z).$$

It follows that ϕa(p, q) defined in (4) can be encoded in the first-order theory of U .

*Remark.* Note that checking the validity of a given presentation of a regular PTS is algorithmic. To see this, suppose we are given a set of formulae $\{\delta_a(x, y, z)\}_{a \in \mathsf{ACT}}$ that is claimed to encode the probabilistic transition functions of a PTS with a branching bound N. Fix a formula $\delta_a$. First, we need to check that for each x ∈ S there are at most N distinct y's for which $\delta_a(x, y, z)$ holds with some z ≠ 0. Second, we need to check that $[\![\delta_a]\!]$ is a function, i.e., $\forall x, y.\ \exists! z.\ \delta_a(x, y, z)$, where $\exists! z.\ \varphi(\bar{x}, z)$ is a shorthand for the formula asserting that there exists precisely one z such that $\varphi(\bar{x}, z)$ is true. Third, we need to check that $[\![\delta_a]\!]$ encodes a mapping $S \to \{0\} \cup D_S$. The first two requirements are easily seen to be expressible as first-order formulas, and hence checking them is algorithmic over U. The third requirement amounts to checking the assertion that there exists $w_a \in \mathbb{N}$ satisfying

$$\begin{aligned} \forall x \in S.\; &\big(\forall y \in S.\ \forall z \in \mathbb{N}.\ \delta_a(x, y, z) \Leftrightarrow z = 0\big) \lor {}\\ &\Big(\exists y_1, \dots, y_N \in S.\ \exists z_1, \dots, z_N \in \mathbb{N}.\ \mathsf{succ}_a(x; y_1, \dots, y_N) \land \bigwedge_{1 \le i \le N} \delta_a(x, y_i, z_i) \land \sum_{1 \le i \le N} z_i = w_a\Big), \end{aligned}$$

which is a first-order formula and is algorithmic over U by the fact that summation of a fixed number of weights is regular (as shown earlier in this section). Finally, since all of the $w_a$'s are expected to be the same common multiple of the denominators of the transition probabilities, we need to check that there is $w \in \mathbb{N}$ such that $w_a = w$ for all a ∈ ACT. This is again algorithmic, as we can pinpoint the exact value of each $w_a$ by enumeration.

#### **5 Application to Anonymity Verification**

In this section, we show how to verify the anonymity property of cryptographic protocols via computation of probabilistic bisimulations. We shall first formalize the connection between the concepts of anonymity and probabilistic bisimulation. We then introduce a verification framework and apply it to verify the anonymity property of the dining cryptographers protocol [16] and the grades protocol [29].

A *(discrete-time) Markov chain* (a.k.a. *DTMC*) is a structure $M := \langle S; \delta; L\rangle$, where S is a set of configurations, $\delta : S \to D_S$ is a family of probability distributions, and $L : S \to \mathsf{ACT}$ is a labelling of the states. We shall use $\delta(s, s')$ to denote $\delta(s)(s')$, the transition probability from s to s′. A sequence $s_0 \cdots s_n \in S^*$ is called a *path* of M if $\delta(s_i, s_{i+1}) > 0$ for $i \in \{0,\ldots,n-1\}$. The probability distribution induced by the paths in a DTMC can be defined using a standard cylinder construction (see e.g. [33]) as follows. Given a finite path $\pi := s_0 \cdots s_n \in S^*$, we set $\mathit{Run}_\pi$ to be a *basic cylinder*: the set of all finite/infinite paths with π as a prefix. We associate this cylinder with the probability $\Pr_{s_0}(\mathit{Run}_\pi) = \prod_{i=0}^{n-1} \delta(s_i, s_{i+1})$. This gives rise to a unique probability measure on the σ-algebra over the set of all paths from $s_0$.
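The cylinder probability is simply a product along the path; a minimal sketch (our dictionary-based encoding of δ):

```python
from fractions import Fraction

def path_probability(delta, path):
    """Probability of the basic cylinder Run_pi of a finite path
    pi = s0...sn: the product of the transition probabilities along pi.
    delta maps a pair (s, s') to the probability delta(s)(s')."""
    p = Fraction(1)
    for s, t in zip(path, path[1:]):
        p *= delta.get((s, t), Fraction(0))
    return p
```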

Given a PTS $S := \langle S; \{\delta_a\}_{a \in \mathsf{ACT}}\rangle$, an *adversary* $f : S^* \to \mathsf{ACT}$ resolves the non-determinism of S and induces a DTMC $S_f := \langle S'; \delta'; L'\rangle$. Here $S' := S^* \cup \{\$\}$ contains all finite paths of S plus a "sink state" \$; we set $\delta'(\pi) := I_{\$}$<sup>2</sup> if and only if either $\pi = \$$ or $\delta_{f(\pi)}$ is the zero function, and define $\delta'(\pi) := \delta_{f(\pi)}$ otherwise. The labelling of $S_f$ is defined as $L'(\$) := \bot$ and $L'(\pi) := f(\pi)$ for $\pi \in S^*$.

Given a DTMC $\langle S; \delta; L\rangle$, the *trace* of a path $\pi := s_0 \cdots s_n \in S^*$ is defined as $\tau(\pi) := L(s_0) \cdots L(s_n)$. A *trace event* T is a set of finite traces; the probability of T with respect to a configuration s is specified as $\Pr_s(T) := \Pr_s\big(\bigcup \{\mathit{Run}_\pi : \tau(\pi) \in T,\ \pi \text{ starts from } s\}\big)$.

Now we are ready to define the concept of anonymity. Fix $S := \langle S; \{\delta_a\}_{a \in \mathsf{ACT}}\rangle$ and a set I ⊆ S of initial configurations. We say S is *anonymous to an adversary* f if for all s ∈ I and trace events T, the value of $\Pr_s(T)$ in $S_f$ is solely determined by T. Intuitively, this means that the adversary cannot obtain any information about a specific initial configuration by experimenting on the system and observing the traces.

We shall only consider external adversaries in this paper. An adversary $f : S^* \to \mathsf{ACT}$ is *external* if $f(s_0 \cdots s_n) = f(s'_0 \cdots s'_n)$ whenever $L(s_i) = L(s'_i)$ for all i ∈ {0,...,n}. That is, an external adversary takes actions based solely on the trace she has observed so far. We call a PTS *anonymous* if it is anonymous to every external adversary. The following result establishes a connection between the anonymity property and probabilistic bisimulations.

**Proposition 2.** *Let* $S := \langle S; \{\delta_a\}_{a \in \mathsf{ACT}}\rangle$ *be a PTS and* f *be an external adversary for* S*. Then for all* u, v ∈ S *such that* u ∼ v*,* $\Pr_u(T) = \Pr_v(T)$ *holds for any trace event* T *in* $S_f$*. That is, configurations* u *and* v *induce the same trace distribution in* $S_f$*.*

Based on Proposition 2, we propose a framework to verify the anonymity property of S as follows. We first specify a "reference system" $S' := \langle S; \{\delta'_a\}_{a \in \mathsf{ACT}}\rangle$ that has

<sup>2</sup> Recall that $I_s$ denotes the point distribution at s, namely $I_s(s) = 1$.

the same initial configurations and actions as those of S, except that the trace distribution of $S'_f$ is independent of the specific initial configuration for any adversary f. We then try to find a bisimulation relation R between S and the reference system S′ satisfying $R \supseteq \{(s, s') \in I \times I : s = s'\}$. When such a relation R is found, we can conclude that the trace distribution of $S_f$ is also independent of the initial configuration for any adversary f, and hence prove the anonymity property of S.

**The Dining Cryptographers Protocol.** The dining cryptographers protocol [16] is a multi-party computation algorithm aiming to securely compute the XOR of the secret bits held by the participants. More precisely, consider a ring of n ≥ 3 participants $p_0,\ldots,p_{n-1}$ such that each participant $p_i$ holds a secret bit $x_i$. To compute $x_0 \oplus \cdots \oplus x_{n-1}$ without revealing information about the values of $x_0,\ldots,x_{n-1}$, the participants carry out a two-stage computation as follows: (i) each pair of adjacent participants $p_i$, $p_{i+1}$ computes a random bit $b_i$ that is accessible only to them; (ii) each participant $p_i$ announces the value $a_i := x_i \oplus b_i \oplus b_{i-1}$<sup>3</sup> to the other participants. Hence, every participant $p_i$ can observe the values of $x_i$, $b_i$, $b_{i-1}$, and $a_0,\ldots,a_{n-1}$. It turns out that $a_0 \oplus \cdots \oplus a_{n-1} = x_0 \oplus \cdots \oplus x_{n-1}$, so all participants are able to compute the XOR of the secret bits after executing the protocol. Furthermore, the anonymity property of the protocol assures that no individual participant $p_i$ can infer the values of the other secret bits from the information she has observed during the execution of the protocol.
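The two-stage computation and the identity between the XOR of the announcements and the XOR of the secrets are easy to simulate; a Python sketch (the function name is ours):

```python
import random

def dining_cryptographers(x):
    """One run of the protocol: adjacent participants i and i+1 (mod n)
    share a coin b[i]; participant i announces x[i] ^ b[i] ^ b[i-1].
    Every coin occurs in exactly two announcements, so the XOR of all
    announcements equals the XOR of all secret bits."""
    n = len(x)
    b = [random.randint(0, 1) for _ in range(n)]
    return [x[i] ^ b[i] ^ b[i - 1] for i in range(n)]  # b[-1] is b[n-1]
```

Since XOR of bits is parity, the correctness identity can be checked on any run by comparing `sum(a) % 2` with `sum(x) % 2`.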

We model the protocol as a length-preserving regular PTS. The configurations of a ring of n participants are encoded as words of length n. The initial configurations are words w ∈ {0, 1}∗ such that w[i] represents $x_i$ for i ∈ {0,..., |w| − 1}. The transition relation consists of six transitions: the observer non-deterministically tossing head (via action head), the observer non-deterministically tossing tail (via action tail), a non-observer tossing head with probability 0.5 (via action toss), a non-observer tossing tail with probability 0.5 (via action toss), a participant announcing zero (via action zero), and a participant announcing one (via action one). The outcomes of the tosses by the observer are visible (i.e., as actions head and tail), while the outcomes of the tosses by the other participants are hidden (i.e., as action toss). Each maximal trace from an initial configuration of length n consists of n successive tossing actions, followed by n successive announcing actions. Starting from an initial configuration w and for i ∈ {0,...,n − 1}, the i-th toss action updates the value of w[j] to $w[j] \oplus b_i$ for j ∈ {i, i + 1}, where $b_i = 1$ if a head is tossed and $b_i = 0$ otherwise. Any configuration v reached after n tosses thus satisfies $v[i] = x_i \oplus b_i \oplus b_{i-1}$ for i ∈ {0,...,n − 1}. The PTS then "prints out" the configuration by going through n announcement transitions, where the i-th announcement is via action one if v[i] = 1 and via action zero if v[i] = 0.

We consider the case where the first participant of the protocol is the observer. The maximal traces of the PTS in this case are of the form $t \cdot t'$, where $|t| = |t'|$, $t \in \{\text{head},\text{tail}\}\,\text{toss}^*\,\{\text{head},\text{tail}\}$, and $t' \in \{\text{zero}, \text{one}\}^*$. For example, head toss tail one zero zero is a maximal trace starting from initial configuration 010. To prove anonymity, we define a reference system such that the initial configurations and the actions are the same as those of the original PTS, except that the announcements $a_0,\ldots,a_{n-1}$

<sup>3</sup> All arithmetical operations on the subscripts are performed modulo n to take the ring structure into account.

encoded in the maximal traces from an initial configuration w are uniformly distributed over $\{(a_0,\ldots,a_{n-1}) : a_0 \oplus \cdots \oplus a_{n-1} = w[0] \oplus \cdots \oplus w[n-1],\ a_0 = w[0] \oplus b_0 \oplus b_{n-1}\}$.<sup>4</sup> In this way, the distribution of the announcements is independent of the initial configuration once the values of $x_0 \oplus \cdots \oplus x_{n-1}$, $x_0$, $b_0$, and $b_{n-1}$ (i.e., the information revealed to the first participant) are fixed. We then compute a probabilistic bisimulation between the original system and the reference system, establishing the anonymity property that the first participant cannot infer the secret bits of the other participants from the information she observes.

*A Generalized Dining Cryptographers Protocol.* We have also considered a generalized dining cryptographers protocol where the secret messages $x_0,\ldots,x_{n-1}$ of the n participants are bit-vectors of the same size. Note that the set of initial configurations is not regular when the size of the secret messages is parameterized. To construct a regular model, we allow a configuration to encode secret messages of different sizes, and devise the transition system such that an initial configuration w can finish the protocol (i.e., can have a trace containing all of the announcements $a_0,\ldots,a_{n-1}$) if and only if the messages encoded in w have the same size. The resulting PTS is a regular system; it over-approximates the PTS of the generalized dining cryptographers protocol in the sense that the anonymity property of the former implies that of the latter.

**The Grades Protocol.** The grades protocol [29] is a multi-party computation algorithm aiming to securely compute the sum of the secrets held by the participants. The setting is similar to that of the dining cryptographers: given n ≥ 3 and g ≥ 2, we have a ring of n participants $p_0,\ldots,p_{n-1}$, where each participant $p_i$ holds a secret $x_i \in \{0,\ldots,g-1\}$. Note that both g and n are parameterized in this protocol. The goal of the participants is to compute the sum $x_0 + \cdots + x_{n-1}$ without revealing information about the individual secrets. Define M := (g − 1) · n + 1. The protocol consists of two steps: (i) each pair of adjacent participants $p_i$, $p_{i+1}$ computes a random number $y_i \in \{0,\ldots,M-1\}$; (ii) each participant $p_i$ announces $a_i := (x_i + y_i - y_{i-1}) \bmod M$ to the other participants. After executing the protocol, the participants compute $a := (a_0 + \cdots + a_{n-1}) \bmod M$. Because of the ring structure, the $y_i$'s cancel out in the sum, so a equals the sum of all secrets (which is smaller than M). The anonymity property of the protocol asserts that no participant can infer the secrets held by the other participants from the information she has observed.
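A run of the protocol is equally easy to simulate; a Python sketch (the function name is ours) that returns the announced total:

```python
import random

def grades_total(x, g):
    """One run of the grades protocol with secrets x[i] in {0,...,g-1}:
    M := (g-1)*n + 1; adjacent participants share y[i] in {0,...,M-1};
    participant i announces (x[i] + y[i] - y[i-1]) mod M.  The y's
    cancel in the total, which therefore equals sum(x) (< M)."""
    n = len(x)
    M = (g - 1) * n + 1
    y = [random.randrange(M) for _ in range(n)]
    a = [(x[i] + y[i] - y[i - 1]) % M for i in range(n)]
    return sum(a) % M
```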

We consider a variant of the grades protocol where M can be any power of two greater than (g − 1) · n. Observe that the same anonymity and correctness property of the original protocol also holds for this variant. To verify the anonymity property, we model an over-approximation of the protocol where the secrets are allowed to range over {0,...,M − 1}. This model is similar to the one we have constructed for the generalized dining cryptographers protocol except that, e.g., the XOR operations are now replaced with bitwise additions and negations. A reference system is specified such that the announcements a1,...,a<sup>n</sup>−<sup>1</sup> observed by the first participant p<sup>0</sup> are uniformly distributed over the values satisfying a<sup>0</sup> +···+a<sup>n</sup>−<sup>1</sup> mod M = x<sup>0</sup> +···+x<sup>n</sup>−<sup>1</sup> mod M.

<sup>4</sup> Such a distribution can be obtained by (i) choose <sup>a</sup>1,...,a*<sup>n</sup>*−<sup>2</sup> ∈ {0, <sup>1</sup>} uniformly at random; (ii) set a<sup>0</sup> = w[0] ⊕ b<sup>0</sup> ⊕ b*<sup>n</sup>*−<sup>1</sup>; (iii) set a*<sup>n</sup>*−<sup>1</sup> = a<sup>0</sup> ⊕···⊕ a*<sup>n</sup>*−<sup>2</sup> ⊕ w[0] ⊕···⊕ w[n − 1].

**Algorithm 1.** Equivalence check for L<sup>∗</sup>

**Input:** Candidate automaton H over Σ × Σ, PTS S, and relation E ⊆ (Σ × Σ)∗.

**Result:** *NoSolution*(v, w) if there is no bisimulation R with E ⊆ R; *PositiveCEX*(v, w) if H should accept (v, w) but does not; *NegativeCEX*(v, w) if H accepts (v, w) but should not; *Correct* if H is a correct bisimulation for PTS S and E ⊆ L(H).

**1** Check whether E ⊆ L(H), and whether S |= Φ(L(H)) using the Φ from (3);
**2** **if** *there is a counterexample of minimal length* n **then**
**3** &nbsp;&nbsp;&nbsp;&nbsp;Compute the greatest bisimulation $\bar{R}_n$ restricted to configurations of length n;
**4** &nbsp;&nbsp;&nbsp;&nbsp;**if** *there is* $(v \otimes w) \in E \setminus \bar{R}_n$ *with* $|v| = |w| = n$ **then**
**5** &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Output *NoSolution*(v, w) and abort;
**6** &nbsp;&nbsp;&nbsp;&nbsp;**else if** *there is* $(v \otimes w) \in L(H) \setminus \bar{R}_n$ *with* $|v| = |w| = n$ **then**
**7** &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**return** *NegativeCEX*(v, w);
**8** &nbsp;&nbsp;&nbsp;&nbsp;**else if** *there is* $(v \otimes w) \in \bar{R}_n \setminus L(H)$ **then**
**9** &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**return** *PositiveCEX*(v, w);
**10** **else**
**11** &nbsp;&nbsp;&nbsp;&nbsp;**return** *Correct*;

By computing a probabilistic bisimulation between the original system and the reference system, we establish that the grades protocol is anonymous whenever M is chosen as a power of two with M ≥ (g − 1) · n + 1.

#### **6 Learning Probabilistic Bisimulations**

We propose an automata learning method to automatically compute regular probabilistic bisimulations R, focusing on the case of *length-preserving* PTSs, which covers all examples given in the previous section. The approach uses active automata learning, for instance Angluin's L* method [5] or refinements of it, to compute R. It is inspired by previous work on using active automata learning for invariant inference [18,54]. Our procedure assumes (i) as input a bounded-branching PTS S = ⟨S; {δ_a}_{a∈ACT}⟩, as well as a length-preserving regular relation E ⊆ (Σ × Σ)* supposed to be covered by R; (ii) an effective way to check the correctness of R, i.e., a decision procedure in the sense of Theorem 1; and (iii) a procedure to compute the greatest probabilistic bisimulation R̄_n ⊆ (Σ × Σ)^n for S restricted to configurations of any length n ∈ N. The last assumption can easily be satisfied for length-preserving PTSs: such systems, restricted to configurations of length n, are finite-state, so efficient existing methods [6,17,20,52] apply. A solution R is presented as a deterministic letter-to-letter transducer, i.e., as a deterministic finite-state automaton over the alphabet Σ × Σ.

Since L*-style learning requires the taught language to be uniquely defined, our approach attempts to learn a representation of the greatest *length-preserving* probabilistic bisimulation relation R̄ ⊆ (Σ × Σ)*, which is the unique bisimulation relation formed by the union of all length-preserving probabilistic bisimulations of S, i.e., R̄ = ⋃_{n≥1} R̄_n. Because R̄ is not in general computable, the learning process might diverge and fail to produce any probabilistic bisimulation. It can also happen that learning terminates, but yields a probabilistic bisimulation relation strictly smaller than R̄.

The L<sup>∗</sup> method requires a teacher that is able to answer two kinds of queries: *membership queries*, asking whether a given word (v ⊗ w) belongs to the target relation, and *equivalence queries*, asking whether a candidate automaton H recognizes the target relation; the latter are handled by Algorithm 1.


*Properties of the Learning Algorithm.* The learning procedure terminates when the teacher outputs *NoSolution* or returns *Correct* for an equivalence query. In the former case, the teacher explicitly provides a pair of non-bisimilar configurations in E. In the latter case, the procedure computes an automaton H such that E ⊆ L(H) and L(H) is a correct probabilistic bisimulation (as it satisfies the proof rule based on Theorem 1), though not necessarily the greatest one. Since all counterexamples reported by the teacher are contained in the symmetric difference L(H) Δ R̄, the learning procedure is guaranteed to terminate for PTSs whose greatest probabilistic bisimulation R̄ is regular.

*Optimization with Inductive Invariants.* There is a natural way to optimize the learning procedure by only considering a *regular* inductive invariant *Inv* such that *Inv* contains the set of reachable configurations and E ⊆ *Inv* × *Inv*. The optimization simply replaces the greatest finite-length bisimulations R̄_i in Algorithm 1, and when answering membership queries, with the greatest bisimulation R̄_i^{Inv} = R̄_i ∩ (*Inv* × *Inv*) restricted to the inductive invariant. Since R̄_i^{Inv} can be much smaller than R̄_i, this can lead to significant speed-ups. Note that a bisimulation R on *Inv* can be extended to a bisimulation R′ on all configurations by setting R′ = R ∪ {(v, v) : v ∉ *Inv*}. The inductive invariant *Inv* may be manually specified, or automatically generated using techniques like those in [18,54].

*Experimental Results and Conclusion.* We have implemented a prototype in Scala to test our learning method. Given a PTS specified over U, our tool first translates it to WS1S formulas and obtains finite automata for these formulas using the Mona tool [30]. Our prototype then applies the L<sup>∗</sup> learning procedure as described in this section, including the optimization to consider only the configurations of valid format. When answering an equivalence query, our tool invokes Mona to verify candidate automata and obtain counterexamples (lines 1–2 of Algorithm 1). We use the prototype tool to prove the anonymity property of the three protocols described in Sect. 5. The proofs generated by our tool are finite-state automata encoding the desired probabilistic bisimulation relations. The experimental results are summarized in Table 1.

**Table 1.** Experimental results. For each case study, we list the size of the final proof produced by our tool, the time taken by Mona to verify the candidate automata, the time taken by our tool to compute the fixed-length bisimulations, and the total computation time of the learning procedure. Experiments are run on a Windows laptop with a 2.4 GHz Intel i5 processor and a 2 GB memory limit.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Semi-quantitative Abstraction and Analysis of Chemical Reaction Networks**

Milan Češka<sup>1(B)</sup> and Jan Křetínský<sup>2</sup>

<sup>1</sup> Brno University of Technology, FIT, IT4I Centre of Excellence, Brno, Czech Republic
ceskam@fit.vutbr.cz
<sup>2</sup> Technical University of Munich, Munich, Germany

**Abstract.** Analysis of large continuous-time stochastic systems is a computationally intensive task. In this work we focus on population models arising from chemical reaction networks (CRNs), which play a fundamental role in analysis and design of biochemical systems. Many relevant CRNs are particularly challenging for existing techniques due to complex dynamics including stochasticity, stiffness or multimodal population distributions. We propose a novel approach allowing not only to predict, but also to explain both the transient and steady-state behaviour. It focuses on qualitative description of the behaviour and aims at quantitative precision only in orders of magnitude. First we build a compact understandable model, which we then crudely analyse. As demonstrated on complex CRNs from literature, our approach reproduces the known results, but in contrast to the state-of-the-art methods, it runs with virtually no computational cost and thus offers unprecedented scalability.

#### **1 Introduction**

Chemical Reaction Networks (CRNs) are a versatile language widely used for *modelling and analysis* of biochemical systems [12] as well as for high-level *programming* of molecular devices [8,40]. They provide a compact formalism equivalent to Petri nets [37], Vector Addition Systems (VAS) [29] and distributed population protocols [3]. Motivated by numerous potential applications ranging from systems biology to synthetic biology, various techniques allowing simulation and formal analysis of CRNs have been proposed [2,9,21,24,39], and embodied in the design process of biochemical systems [20,25,32]. The time evolution of a CRN is governed by the Chemical Master Equation (CME), which describes the probability distribution over the molecular counts of each chemical species. Many important biochemical systems lead to complex dynamics that includes *state space explosion, stochasticity, stiffness, and multimodality* of the population distributions

This work has been supported by the Czech Science Foundation grant No. GA19- 24397S, the IT4Innovations excellence in science project No. LQ1602, and the German Research Foundation (DFG) project KR 4890/2-1 "Statistical Unbounded Verification".

[23,44], and that fundamentally limits the class of systems the existing techniques can effectively handle. More importantly, biologists and engineers often seek plausible explanations of why the system under study does or does not exhibit the required behaviour. In many cases, a set of system simulations/trajectories or population distributions is not sufficient, and the ability to provide an accurate explanation of the temporal or steady-state behaviour remains a major challenge for the existing techniques.

In order to cope with the computational complexity of the analysis and in order to obtain explanations of the behaviour, we shift the focus from quantitatively precise results to a more qualitative analysis, closer to how a human would behold the system. Yet we insist on providing at least rough timing information on the behaviour as well as a rough classification of the probability of different behaviours to the extent of "very likely", "a few percent", "barely possible", so that we can conclude on issues such as time to extinction or bimodality of behaviour. This gives rise to our *semi-quantitative* approach. We stipulate that analyses in this framework reflect quantities in orders of magnitude, both for time durations and probabilities, but not more than that. This paradigm shift is reflected on two levels: (1) we abstract systems into semi-quantitative models; (2) we analyse systems in a semi-quantitative way. While each of the two can be combined with a traditional abstraction/analysis, when combined together they provide powerful means to understand systems' behaviour with virtually no computational cost.

**Semi-quantitative Models.** The states of the models capture the current amount of objects of each species as an interval, often spanning several orders of magnitude, unless instructed otherwise. For instance, if the amount of a certain species is to be closely monitored (as part of the input specification/property of the system), then this abstraction can be finer. Similarly, whenever the analysis of a previous version of the abstraction points to a lack of precision in certain states, preventing us from concluding which of the possible behaviours is prevalent, the corresponding refinement can take place. Further, the rates of the transitions are also captured only with such imprecision. The crucial point allowing for the existence of such models that are small, yet faithful, is our concept of *acceleration*. It captures certain *sequences* of transitions and eliminates most of the non-determinism that paralyses other types of abstractions, which are too over-approximative and unable to conclude anything but safety properties.

**Semi-quantitative Analysis.** Instead of performing exact transient or steady-state analysis, we consider the most probable transitions and then carefully lift this to the most probable temporal behaviours. Technically, this is done by *alternating between transient and steady-state analysis*, where only some rates and transitions are taken into account at different stages. In order to further facilitate the human insight into the result of the analysis, we provide an algorithm to perform this analysis with virtually no computational effort, and thus possibly manually. The trivial computations immediately pinpoint why certain behaviours occur. Moreover, less likely behaviours can also be identified easily, to any desired degree of improbability (dozens of percent, per mille, etc.).

To summarise, the first step yields tiny models, allowing for a synoptic observation of the model; due to their size, these models can either be analysed easily using standard means, or be subject to the second step. The second step provides an efficient approximative analysis, which is also very illustrative due to the limited use of quantities. It can be applied to any system; however, it is particularly interesting in connection with the models coming from the first step since (i) no extra effort (size, computation) is wasted on an overly precise treatment that is ignored by the other step, and (ii) together they yield an understandable explanation of the behaviour. An entertaining feature of this paradigm is that the stiffer the system is (with rates at hugely different time scales), the easier it is to analyse.

To demonstrate the capabilities of our approach, we consider three challenging and biologically relevant case studies that have been used in the literature to evaluate state-of-the-art methods for CRN analysis. It has been shown that many approaches fail, either due to time-outs or an inability to capture differences in behaviours, and some tailored ones require considerable computational effort, e.g. an hour of computation. Our experiments clearly show that the proposed approach delivers results that yield qualitatively the same information and more understanding, and that can be computed in minutes by hand (or within a fraction of a second by computer).

**Our contribution** can be summarized as follows:


#### **Related Work**

To the best of our knowledge, there does not exist any abstraction of CRNs similar to the proposed approach. Indeed, there exist various abstraction and approximation schemes for CRNs that improve the performance and scalability of both the simulation-based and the numerical-based techniques. In the following paragraphs, we discuss the most relevant directions and the links to our approach.

**Approximate Semantics for CRNs.** For CRNs including large populations of species, fluid (mean-field) approximation techniques can be applied [5] and extended to approximate higher-order moments [15]: these deterministic approximations lead to a set of ordinary differential equations (ODEs). An alternative is to approximate the CME as a continuous-state stochastic process. The Linear Noise Approximation (LNA) is a Gaussian process which has been derived as an approximation of the CME [16,44] and describes the time evolution of expectation and variance of the species in terms of ODEs. Recently, an aggregation scheme over ODEs that aims at understanding the dynamics of large CRNs has been proposed in [10]. In contrast to our approach, the deterministic approximations cannot adequately capture the stochasticity of CRNs caused by low population species.

To mitigate this drawback, various *hybrid models* have been proposed. The common idea of these models is as follows: the dynamics of low-population species is described by a discrete stochastic process, and the dynamics of large-population species is approximated by a continuous process. The particular hybrid models differ in the approximation of the large-population species. In [27], a pure deterministic semantics for large-population species is used. A moment-based description for medium/high-copy-number species was used in [24]. The LNA approximation and an adaptive partitioning of the species according to leap conditions (which is more general than partitioning based on population thresholds) was proposed in [9]. All hybrid models have to deal with interactions between low- and large-population species. In particular, the dynamics of the stochastic process describing the low-population species is conditioned by the continuous state describing the concentration of the large-population species. The numerical analysis of such a conditioned stochastic process is typically a computationally demanding task that limits the scalability.

In contrast, our approach does not explicitly partition the species, but rather abstracts the concrete species population using an interval abstraction and tries to effectively capture both the stochastic and the deterministic behaviour with the help of the accelerated transitions. As we already emphasised, the proposed abstraction and analysis avoids any numerical computation of precise quantities.

**Reduction Techniques for Stochastic Models.** A widely studied reduction method for Markov models is state aggregation based on lumping [6] or (bi-)simulation equivalence [4], with the latter notion in its exact [33] or approximate [13] form. Approximate notions of equivalence have led to new abstraction/refinement techniques for the numerical verification of Markov models over finite [14] as well as uncountably-infinite state spaces [1,41,42]. Several approximate aggregation schemes leveraging the structural properties of CRNs were proposed [17,34,45]. Abate et al. proposed an adaptive aggregation that gives formal guarantees on the approximation error, but typically provides lower state space reductions [2]. Our approach shares the idea of abstracting the state space by aggregating some states together. Similarly to [17,34,45], we partition the state space based on the species population, i.e. we also introduce population levels. In contrast to the aforementioned aggregation schemes, we propose a novel abstraction of the transition relation based on acceleration. It allows us to avoid the numerical solution of the approximate CME and thus achieve a better reduction while providing an accurate prediction of the system behaviour.

Alternative methods to deal with large/infinite state spaces are based on a state truncation trying to eliminate insignificant states, i.e., states reached only with a negligible probability. These methods, including finite state projections [36], sliding window abstractions [26], or fast adaptive uniformisation [35], are able to quantify the total probability mass that is lost due to the truncation, but typically cannot effectively handle systems involving a stiff behaviour and multimodality [9].

**Simulation-Based Analysis.** Transient analysis of CRNs can be performed using the Stochastic Simulation Algorithm (SSA) [21]. Note that the SSA produces a single realisation of the stochastic process, whereas the stochastic solution of CME gives the probability distribution of each species over time. Although simulation-based analysis is generally faster than direct solution of the stochastic process underlying the given CRN, obtaining good accuracy necessitates potentially large numbers of simulations and can be very time consuming.

Various partitioning schemes for species and reactions have been proposed to speed up the SSA in multi-scale systems [23,38,39]. For instance, Cao et al. introduced the slow-scale SSA [7], where they distinguish between fast and slow species. Fast species are then treated under the assumption that they reach equilibrium much faster than the slow ones. Adaptive partitioning of the species has been considered in [19,28]. In contrast to simulation-based analysis, our approach (i) provides a compact explanation of the system behaviour in the form of tiny models allowing for a synoptic observation and (ii) can easily reveal less probable behaviours.

#### **2 Chemical Reaction Networks**

In this paper, we assume familiarity with standard verification of (continuous-time) probabilistic systems, e.g. [4]. For more detail, see [11, Appendix].

*CRN Syntax.* A *chemical reaction network (CRN)* N = (Λ, R) is a pair of finite sets, where Λ is a set of *species* (|Λ| denotes its size) and R is a set of reactions. Species in Λ interact according to the reactions in R. A *reaction* τ ∈ R is a triple τ = (r_τ, p_τ, k_τ), where r_τ ∈ N^{|Λ|} is the *reactant complex*, p_τ ∈ N^{|Λ|} is the *product complex* and k_τ ∈ R_{>0} is the coefficient associated with the rate of the reaction. r_τ and p_τ represent the stoichiometry of reactants and products. Given a reaction τ_1 = ([1, 1, 0], [0, 0, 2], k_1), we often write it as $\tau\_1 : \lambda\_1 + \lambda\_2 \xrightarrow{k\_1} 2\lambda\_3$.
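The syntax maps directly onto a small data type. A sketch in Python (the type and field names are ours, not the paper's):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Reaction:
    """tau = (r_tau, p_tau, k_tau): stoichiometries and rate coefficient."""
    r: Tuple[int, ...]  # reactant complex, one entry per species
    p: Tuple[int, ...]  # product complex
    k: float            # rate coefficient k_tau > 0

# tau_1 : lambda_1 + lambda_2 --k_1--> 2*lambda_3 over species (l1, l2, l3)
tau1 = Reaction(r=(1, 1, 0), p=(0, 0, 2), k=1.0)
# state change upsilon_tau = p_tau - r_tau
v1 = tuple(p - r for p, r in zip(tau1.p, tau1.r))
assert v1 == (-1, -1, 2)
```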

*CRN Semantics.* Under the usual assumption of mass-action kinetics, the *stochastic* semantics of a CRN N is generally given in terms of a discrete-state, continuous-time stochastic process **X**(t) = (X_1(t), X_2(t), ..., X_{|Λ|}(t)), t ≥ 0 [16]. The *state change* associated to the reaction τ is defined by υ_τ = p_τ − r_τ, i.e. the state **X** is changed to **X**′ = **X** + υ_τ, which we denote as **X** →_τ **X**′. For example, for τ_1 as above, we have υ_{τ_1} = [−1, −1, 2]. For a reaction to happen in a state **X**, all reactants have to be present in sufficient numbers. The *reachable state space* of **X**(t), denoted **S**, is the set of all states reachable by a sequence of reactions from a given *initial state* **X**_0. The set of reactions changing the state **X**_i to the state **X**_j is denoted reac(**X**_i, **X**_j) = {τ | **X**_i →_τ **X**_j}.

The behaviour of the stochastic system **X**(t) can be described by the (possibly infinite) continuous-time Markov chain (CTMC) γ(N) = (**S**, **X**_0, **R**), where the transition matrix **R**(i, j) gives the rate of the transition from **X**_i to **X**_j. Formally,

$$\mathbf{R}(i,j) = \sum\_{\tau \in \text{reac}(\mathbf{X}\_i, \mathbf{X}\_j)} k\_\tau \cdot C\_{\tau, i} \quad \text{where} \quad C\_{\tau, i} = \prod\_{\ell=1}^{|\Lambda|} \binom{\mathbf{X}\_{i, \ell}}{r\_\ell} \tag{\text{R}}$$

Here C_{τ,i} corresponds to the population-dependent term of the *propensity function*, where **X**_{i,ℓ} is the ℓ-th component of the state **X**_i and r_ℓ is the stoichiometric coefficient of the ℓ-th reactant in the reaction τ. The CTMC γ(N) is the accurate representation of the CRN N, but, even when finite, it is not scalable in practice because of the state space explosion problem [25,31].
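The rate formula (R) amounts to a few lines of code. A sketch using Python's `math.comb` for the binomial coefficients (names are illustrative):

```python
from math import comb

def propensity_coeff(state, reactants):
    """C_{tau,i} = prod_l binom(X_{i,l}, r_l): the population-dependent
    term of the propensity; it is 0 when some reactant is too scarce."""
    c = 1
    for x, r in zip(state, reactants):
        c *= comb(x, r)
    return c

def rate(state, reactants, k):
    """One summand k_tau * C_{tau,i} of the rate R(i, j)."""
    return k * propensity_coeff(state, reactants)

# tau_1 : l1 + l2 -> 2 l3 in state X = (3, 2, 0): rate = k * 3 * 2
assert rate((3, 2, 0), (1, 1, 0), 1.0) == 6.0
# no l2 present: the reaction is disabled, rate 0
assert rate((3, 0, 0), (1, 1, 0), 1.0) == 0.0
```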

#### **3 Semi-quantitative Abstraction**

In this section, we describe our abstraction. We derive the desired CTMC conceptually in several steps, which we describe explicitly, although we implement the construction of the final system directly from the initial CRN.

#### **3.1 Over-Approximation by Interval Abstraction and Acceleration**

Given a CRN N = (Λ, R), we first consider an interval continuous-time Markov decision process (interval CTMDP<sup>1</sup>), which is a finite abstraction of the infinite γ(N). Intuitively, abstract states are given by intervals on the sizes of populations, with the additional feature that the abstraction captures enabledness of reactions. The transition structure follows the ideas of the standard may abstraction and of the three-valued abstraction of continuous-time systems [30]. A technical difference in the latter point is that we abstract rates into intervals instead of uniformising the chain and then abstracting only the transition probabilities into intervals; this is necessary in later stages of the process. The main difference is that we also treat certain sequences of actions, which we call acceleration.

**Abstract Domains.** The first step is to define the abstract domain for the population sizes. For every species λ ∈ Λ, we define a finite partitioning A<sup>λ</sup> of N into intervals, reflecting the rough size of the population. Moreover, we want the abstraction to reflect whether a reaction is enabled. Hence we require that

<sup>1</sup> Interval CTMDP is a CTMDP with lower/upper bounds on rates. Since it serves only as an intermediate formalism to ease the presentation, we refrain from formalising it here.

{0} ∈ A_λ for the case when the coefficient of this species as a reactant is always 0 or 1; in general, for every i < max_{τ∈R} r_τ(λ) we require {i} ∈ A_λ.

The abstraction α_λ(n) of a number n of a species λ is then the I ∈ A_λ for which n ∈ I. The state space of α(N) is the product ∏_{λ∈Λ} A_λ of the abstract domains, with the point-wise defined abstraction α(**n**)_λ = α_λ(**n**_λ).
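A population partitioning and its abstraction function are straightforward to code. A sketch (the interval encoding is ours), using an example partitioning [0], [1..5], [6..20], [21..∞):

```python
def make_alpha(partition):
    """alpha_lambda for one species: map n to the unique interval
    (lo, hi) of the partitioning A_lambda containing it; hi=None
    encodes an unbounded interval. The encoding is illustrative."""
    def alpha(n):
        for lo, hi in partition:
            if lo <= n and (hi is None or n <= hi):
                return (lo, hi)
        raise ValueError(f"no interval contains {n}")
    return alpha

alpha = make_alpha([(0, 0), (1, 5), (6, 20), (21, None)])
assert alpha(0) == (0, 0)
assert alpha(13) == (6, 20)
assert alpha(1000) == (21, None)
```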

The abstract domain for the rates according to (R) is the set of all real intervals.

Transitions from an abstract state are defined by the may abstraction as follows. Since our abstraction reflects enabledness, the same set of actions is enabled in all concrete states of a given abstract state. The targets of an action in the abstract setting are the abstractions of all possible concrete successors, i.e. *succ*(s, a) := {α(**n**) | **m** ∈ s, **m** →_a **n**}, in other words, the transitions enabled in at least one of the respective concrete states. The abstract rate is the smallest interval including all the concrete rates of the respective concrete transitions. This can be easily computed by the corner-points abstraction (evaluating only the extremal values for each species), since the rates are monotone in the population sizes.

**High-Level of Non-determinism.** The (more or less) standard style of the abstraction above has several drawbacks—mostly related to the high degree of non-determinism for rates—which we will subsequently discuss.

Firstly, in connection with the abstract population sizes, transitions to different sizes happen only non-deterministically, leaving us unable to determine which behaviour is probable. For example, consider the simple system given by λ →_d ∅ with k_d = 10^{-4}, so a degradation happens on average every 10^4 seconds. Assume a population discretisation into [0], [1..5], [6..20], [21..∞), with the abstraction depicted in Fig. 1. While the original system obviously moves from [6..20] to [1..5] very probably in less than 15 · 10^4 seconds, the abstraction cannot even say that this happens, let alone estimate the time.


**Fig. 1.** Above: Interval CTMDP abstraction with intervals on rates and nondeterminism. Below: Interval CTMC abstraction arising from acceleration.

**Acceleration.** To address this issue, we drop the non-deterministic self-loops and transitions to higher/lower populations in the abstract system.<sup>2</sup> Instead,

<sup>2</sup> One can also preserve the non-determinism for the special case when one of the transitions leads to a state where some action ceases to be enabled. While this adds more precision, the non-determinism in the abstraction makes it less convenient to handle.

we *"accelerate"* their effect: we consider sequences of these actions that, in the concrete system, have the effect of changing the population level. In our example above, we need to take the transition 1 to 13 times from [6..20], with various rates depending on the current concrete population, in order to get to [1..5]. This makes the precise timing more complicated to compute. Nevertheless, the expected time can be approximated easily: here it ranges from (1/6) · 10^4 ≈ 0.17 · 10^4 (for population 6) to roughly (1/20 + 1/19 + ··· + 1/6) · 10^4 ≈ 1.3 · 10^4 (for population 20). This results in an interval CTMC.<sup>3</sup>
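The expected-time bounds for this accelerated transition are a short computation, summing the expected exponential waiting times along the way down. A sketch reproducing the numbers above:

```python
K_D = 1e-4  # degradation rate coefficient of the running example

def expected_hitting_time(n0, lower=6):
    """Expected time for the population to fall from n0 below `lower`,
    summing waiting times 1/(n * K_D) for n = lower, ..., n0."""
    return sum(1.0 / (n * K_D) for n in range(lower, n0 + 1))

t_best = expected_hitting_time(6)    # ~0.17e4 s, starting from population 6
t_worst = expected_hitting_time(20)  # ~1.3e4 s, starting from population 20
assert abs(t_best - 1e4 / 6) < 1e-6
assert 1.30e4 < t_worst < 1.32e4
```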

**Concurrency in Acceleration.** The accelerated transitions can, due to their higher number of occurrences, be considered continuous or deterministic, as opposed to the discrete stochastic changes distinguished in the hybrid approach. The usual differential-equation approach would also take into account other reactions that are modelled deterministically and would combine their effects into one equation. In order to simplify the exposition and computation, and, as we see later, without much loss of precision, we consider only the fastest change (or non-deterministically several of them if their rates are similar).<sup>4</sup>

#### **3.2 Operational Semantics: Concretisation to a Representative**

The next disadvantage of the classical abstraction philosophy, manifested in the interval CTMC above, is that the precise-valued intervals on rates imply high computational effort during the analysis. Although the system is smaller, standard transient analysis is still quite expensive.

**Concretisation.** In order to deal with this issue, the interval can be approximated roughly by the expected time it would take for an average population in the considered range; in our example the "average" representative is 13. Then the first transition occurs with rate 13 · 10^{-4} ≈ 10^{-3} and needs to happen 7 times, yielding expected time (7/13) · 10^4 ≈ 0.5 · 10^4 (ignoring even the precise slow-downs in the rates as the population decreases). Already this very rough computation yields relative precision within a factor of 3 for all the populations in this interval, thus yielding the correct order of magnitude with virtually no effort. We lift the concretisation naturally to states and denote the concretisation of an abstract state s by γ(s). The complete procedure is depicted in Algorithm 1.
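The representative-based concretisation in the running example amounts to a few lines (a sketch of the rough computation, with the step count taken from the text above):

```python
K_D = 1e-4
rep = (6 + 20) // 2      # "average" representative of [6..20]: 13
rate = rep * K_D         # 1.3e-3, i.e. roughly 1e-3
steps = 13 - 6           # 7 firings, per the rough count in the text
expected = steps / rate  # (7/13) * 1e4, roughly 0.5e4 seconds
assert rep == 13
assert 0.5e4 < expected < 0.55e4
```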

The concretisation is one of the main points where we deliberately drop a lot of quantitative information, while still preserving some to conclude on big quantitative differences. Of course, the precision improves with more precise abstract domains and also with higher differences on the original rates.

<sup>3</sup> The waiting times are not distributed according to the rates in the intervals. It is only the expected waiting time (reciprocal of the rate) that is preserved. Nevertheless, for ease of exposition, instead of labelling the transitions with expected waiting times we stick to the CTMC style with the reciprocals and formally treat it as if the label was a real rate.

<sup>4</sup> Typically the classical concurrency diamond appears, and the effects of the other accelerated reactions happen just after the first one.



It remains to determine the representative for the unbounded interval. In order to avoid infinity, we require an additional input for the analysis: deemed upper bounds on the possible population of each species. In cases when an upper bound is hard to assume, we can analyse the system with an arbitrary one and see if the last interval is reachable with significant probability. If it is, we need to use this upper bound as a new point in the interval partitioning and try a higher upper bound next time. In general, such conditions can be checked in the abstraction, and their violation implies a recommendation to refine the abstract domains accordingly.

**Orders-of-Magnitude Abstraction.** Such an approximation is thus sufficient to determine, most of the time, whether the acceleration (sequence of actions) happens sooner or later than, e.g., another reaction with rate 10<sup>−6</sup> or 10<sup>−2</sup>. Note that this *decision* gets more precise not only as we refine the population levels, but also as the system gets stiffer (the concrete values of the rates differ more), even though stiff systems are normally harder to analyse. For ease of presentation, in our case studies we depict only the magnitude of the rates, i.e. the decadic logarithm rounded to an integer.
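A sketch of this rate-magnitude view (our own illustrative helper, not the authors' code):

```python
import math

def magnitude(rate):
    """Order of magnitude of a rate: the decadic logarithm rounded to an integer."""
    return round(math.log10(rate))

# The acceleration above (rate roughly 13e-4) has magnitude -3, so it clearly
# happens later than a reaction of magnitude -2 and sooner than one of magnitude -6.
m = magnitude(13e-4)
```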

**Non-determinism and Refinement.** If two rates are close to each other, say of the same magnitude (or differing by 1), such a rough computation (and rough population discretisation) is not precise enough to determine which of the reactions happens sooner with high probability. Both may be happening at roughly the same pace, or with more information we could conclude that one of them is considerably faster. This introduces an uncertainty, showing that different behaviours are possible depending on the exact quantities. This indicates points where refinement might be needed if more precise results are required. For instance, with rates of magnitudes 2 and 3, the latter should be happening most of the time, the former only with a few percent chance. If we want to know whether it is rather tens of percent or tenths of percent, we should refine the abstraction.

#### **4 Semi-quantitative Analysis**

In this section, we present an approximative analysis technique that describes the most probable transient and steady-state behaviour of the system (also with rough timing) and, on demand, also the (one or more orders of magnitude) less probable behaviours. As such it is robust in the sense that it is well suited to work with imprecise rates and populations. It is computationally easy (it can be done by hand in the time other methods require on a computer), while still yielding significant quantitative results ("in orders of magnitude"). It does not provide exact error guarantees, since computing them would be almost as expensive as the classical analysis. It only features trivial limit-style bounds: as the population abstraction gets more and more refined, the probabilities converge to those of the original system; further, the higher the separation between the rate magnitudes, the more precise the approximation is, since the other factors (and thus the incurred imprecisions) play a less significant role.

Intuitively, the main idea—similar to some multi-rate simulation techniques for stiff systems—is to "simulate" "fast" reactions until the steady state and then examine which slower reactions take place. However, "fast" does not mean faster than some constant, but faster than other transitions in a given state. In other words, we are not distinguishing fast and slow reactions, but tailor this to each state separately. Further, "simulation" is not really a stochastic simulation, but a deterministic choice of the fastest available transition. If a transition is significantly faster than others then this yields what a simulation would yield. When there are transitions with similar rates, e.g. with at most one order of magnitude difference, then both are taken into account as described in the following definition.

**Pruned System.** Consider the underlying graph of the given CTMC. If we keep only the outgoing transitions with the maximum rate in each state, we call the result *pruned*. If there is always (at most) one transition then the graph consists of several paths leading to cycles. In general when more transitions are kept, it has bottom strongly connected components (bottom SCCs, BSCCs) and some transient parts.

We generalise this concept to n-*pruning*, which preserves all transitions with a rate that is not more than n orders of magnitude smaller than the maximum rate in the state. The pruning above is then 0-pruning; 1-pruning preserves also transitions happening up to 10 times slower, which can thus still happen with tens of percent probability; 2-pruning is relevant for analyses that also track behaviour occurring with units of percent, etc.
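The following sketch implements n-pruning on a rate-labelled graph stored as a dict from state to a list of (target, rate) pairs (our own minimal rendering of the definition, comparing rates by their magnitudes):

```python
import math

def n_prune(graph, n=0):
    """Keep only transitions whose rate magnitude is within n orders of
    magnitude of the fastest outgoing rate of the state (0-pruning keeps
    only the fastest transitions)."""
    pruned = {}
    for state, transitions in graph.items():
        if not transitions:
            pruned[state] = []
            continue
        mags = [round(math.log10(rate)) for _, rate in transitions]
        cutoff = max(mags) - n
        pruned[state] = [t for t, m in zip(transitions, mags) if m >= cutoff]
    return pruned

g = {"s": [("t", 100.0), ("u", 10.0), ("v", 0.1)], "t": [], "u": [], "v": []}
# 0-pruning keeps only ("t", 100.0); 1-pruning additionally keeps ("u", 10.0)
```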

**Algorithm Idea.** Here we explain the idea of Algorithm 2. The transient parts of the pruned system describe the most probable behaviour from each state until the point where visited states start to repeat a lot (the steady state of the pruned system). In the original system, the usual behaviour is then to stay in this SCC C until one of the pruned (slower) reactions occurs, say from state s to state t. This may bring us to a different component of the pruned graph, and the analysis process repeats. However, t may also bring us back into C, in which case we stay in the steady state, which is basically the same as without the transition from s to t. Further, t might be in the transient part leading to C, in which case these states are added to C and the steady state changes a bit, spreading the distribution slightly also to the previously transient states. Finally, t might lead us into a component D that this run visited before reaching C. In that case, the steady-state distribution spreads over all the components visited between D and C, putting probability mass on each with a different order of magnitude, depending on all the (magnitudes of) sojourn times in the transient and steady-state phases on the way.

Using the macros defined in the algorithm (e.g. *exitingRate*(s) is the maximum rate of transitions from s not in the pruned graph), the correctness of the computations can be shown as follows. For the time spent in the transient phase (line 16), we consider the slowest sojourn time on the way times the number of such transitions; this is accurate since the other times are by order(s) of magnitude shorter, hence negligible. The steady-state distribution on a BSCC of the pruned graph can be approximated by *minStayingRate*/(m · *stayingRate*(·)) on line 5. Indeed, it corresponds to the steady-state distribution if the BSCC is a cycle and the *minStayingRate* is significantly larger than the other rates in the BSCC, since then the return time for the states is approximately m/*minStayingRate* and the sojourn time 1/*stayingRate*(·). The component is exited from s with the proportion given by its steady-state distribution times the probability to take the exit during that time. The former is approximated above; the latter can be approximated by the density at 0, i.e. by *exitingRate*(s), since the staying rate is significantly faster. Hence the candidates for exiting maximise *exitingRate*(·)/*stayingRate*(·), as on line 7. There are |*exitStates*| candidates for exit, and the time to exit the component via a particular candidate s is the expected number of visits before exit, i.e. *stayingRate*(s)/*exitingRate*(s), times the return time m/*minStayingRate*, hence the expression on line 9.
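The steady-state and exit approximations just described can be condensed into a small sketch (our own code mirroring the quoted formulas from Algorithm 2, not the authors' implementation; `staying_rates` maps each BSCC state to its rate inside the pruned BSCC, `exiting_rates` to its fastest pruned exit, where one exists):

```python
def bscc_exit_sketch(staying_rates, exiting_rates):
    m = len(staying_rates)
    min_staying = min(staying_rates.values())
    # line 5: steady state of s approximated by minStayingRate / (m * stayingRate(s))
    steady = {s: min_staying / (m * r) for s, r in staying_rates.items()}
    # line 7: exit candidates maximise exitingRate(s) / stayingRate(s)
    best = max(exiting_rates, key=lambda s: exiting_rates[s] / staying_rates[s])
    # line 9: expected visits before exit (stayingRate/exitingRate)
    #         times the return time (m / minStayingRate)
    exit_time = (staying_rates[best] / exiting_rates[best]) * (m / min_staying)
    return steady, best, exit_time

steady, best, exit_time = bscc_exit_sketch(
    {"s1": 10.0, "s2": 10.0, "s3": 100.0},   # rates kept inside the BSCC
    {"s1": 1.0, "s3": 5.0},                  # fastest pruned (exiting) rates
)
```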

**Fig. 2.** Alternating transient and steady-state analysis.

For example, consider the system in Fig. 2. Iteration 1 reveals the part with solid lines, with two (temporary) BSCCs {t} and {s1, s2, s3}. The former turns out to be definitely bottom. The latter has a steady state proportional to (10<sup>−1</sup>, 10<sup>−1</sup>, 100<sup>−1</sup>). Its most probable exits are the dashed ones, identified in the subsequent iteration 2, probable proportionally to (1/10, 10/100); the expected time to take them is 10 · 2/(2 · 10 · 1) = 1 = 100 · 2/(2 · 10 · 10). The latter leads back to the current SCC and does not change the set of BSCCs (hence in our examples below we often either skip or merge such iterations for the sake of readability). In contrast, the former leads to a previous SCC; thereafter {s1, s2, s3} is no longer a bottom SCC and consequently the third exit to u is not even analysed. Nevertheless, it could still happen with minor probability, which can be seen if we consider 1-pruning instead.

#### **5 Experimental Evaluation and Discussion**

In order to demonstrate the applicability and accuracy of our approach, we selected the following three biologically relevant case studies: (1) a stochastic model of gene expression [22,24], (2) Goutsias's model [23] describing transcription regulation of a repressor protein in bacteriophage λ, and (3) a viral infection model [43].

Although the underlying CRNs are quite small (up to 5 species and 10 reactions), their analysis is very challenging: (i) the stochasticity has a strong impact on the dynamics of these systems and thus purely deterministic approximations via ODEs are not accurate, (ii) the systems include species with low, medium, and high populations and thus the resulting state space of the stochastic process is prohibitively large to perform precise numerical analysis and existing reduction/approximation techniques are not sufficient (they are either too imprecise or do not provide sufficient reduction factors), and (iii) the system dynamics leads to bi-modal distributions and/or is affected by stiff reactions.

These models thus represent perfect candidates for evaluating advanced approximation methods including various hybrid approaches [9,24,27]. Although these approaches can handle the models, they typically require tens of minutes or hours of computation time. Similarly, simulation-based methods are very time consuming, especially in the case of very stiff CRNs, represented here by the viral infection model. We demonstrate that our approach provides accurate predictions of the system behaviour and is feasible even when performed manually by a human.

Recall that the algorithm that builds the abstract model of the given CRN takes as input two vectors representing the population discretisation and the population bounds. We generally assume that these inputs are provided by users who have a priori knowledge about the system (e.g. in which orders of magnitude the species populations occur) and that the inputs also reflect the level of detail the users are interested in. In the following case studies, however, we set the inputs only based on the rate orders of the reactions affecting the particular species (unless mentioned otherwise).

#### **5.1 Gene Expression Model**

The CRN underlying the gene expression model is described in Table 1. As discussed in [24] and experimentally observed in [18], the system oscillates between two phases characterised by the Don state and the Doff state, respectively. Biologists are interested in how the distribution of the Don and Doff states is aligned with the distribution of RNA and proteins P, and how the correlation among the distributions depends on the DNA switching rates.

The state vector of the underlying CTMC is given as [P, RNA, Doff, Don]. We use very relaxed bounds on the maximal populations, namely the bound 1000 for P and 100 for RNA. Note the DNA invariant Don + Doff = 1. As in [24], the initial state is given as [10,4,1,0].

We first consider the slow switching rates that lead to more complicated dynamics including bimodal distributions. In order to demonstrate the refinement step and its effect on the accuracy of the model, we start with a very coarse abstraction. It distinguishes only zero and nonzero populations and thus is not able to adequately capture the relationship between the DNA state and the RNA/P populations. The pruned abstract model obtained using Algorithms 1 and 2 is depicted in Fig. 3 (left); the full one before pruning is shown in Fig. 6 [11, Appendix].

The proposed analysis of the model identifies the key trends in the system dynamics. The red transitions, representing iterations 1–3, capture the most probable paths in the system. The green component includes states with DNA on (i.e. Don = 1) where the system oscillates. The component is reached via the blue state with Doff and no RNAs/P. The blue state is promptly reached from the initial state and then the system waits (roughly 100 h according to our rate abstraction) for the next DNA activation. The oscillation is left via a deactivation in iteration 4 (the blue dotted transition)<sup>5</sup>. The estimation of the exit time computed using Algorithm 2 is also 100 h. The deactivation is then followed by fast red transitions leading to the blue state, where the system waits for the next activation. Therefore, we obtain an oscillation between the blue state and the green component, representing the expected oscillation between the Don and Doff states.

**Table 1.** Gene expression. For slow DNA switching, *r*<sub>1</sub> = *r*<sub>2</sub> = 0.05. For fast DNA switching, *r*<sub>1</sub> = *r*<sub>2</sub> = 1. The rates are in h<sup>−1</sup>.

$$\begin{array}{cccc} \text{D}_{\text{off}} \xrightarrow{r_{1}} \text{D}_{\text{on}} & \text{D}_{\text{on}} \xrightarrow{r_{2}} \text{D}_{\text{off}} & \text{D}_{\text{on}} \xrightarrow{10} \text{D}_{\text{on}} + \text{RNA} & \text{RNA} \xrightarrow{1} \emptyset \\ \text{RNA} \xrightarrow{4} \text{RNA} + \text{P} & \text{P} \xrightarrow{1} \emptyset & \text{P} + \text{D}_{\text{off}} \xrightarrow{0.0015} \text{P} + \text{D}_{\text{on}} \end{array}$$

**Fig. 3.** Pruned abstraction for the gene expression model using the coarse population discretisation (left) and after the refinement (right). The state vector is [P, RNA, Doff, Don].

As expected, this abstraction does not clearly predict the bimodal distribution of the RNA/P populations, as the trivial population levels do not carry any information besides reaction enabledness. In order to obtain a more accurate analysis of the system, we refine the population discretisation using a single level threshold for P and RNA, equal to 100 and 10, respectively (the rates in the CRN indicate that the population of P reaches higher values).

Figure 3 (right) depicts the pruned abstract model with the new discretisation (the full model is depicted in Fig. 7 [11, Appendix]). We again obtain the oscillation between the green component representing DNAon states and the blue DNAoff state. The states in the green component more accurately predict

<sup>5</sup> In Fig. 3, the dotted transitions denote exit transitions representing the deactivations.

that in the DNAon states the populations of RNA and P are high and drop to zero only for short time periods. The figure also shows orange transitions within iteration 2 that extend the green component by two states. Note that the system promptly returns from these states back to the green component. After the deactivation in iteration 4, the system takes (within the same iteration) the fast transitions (solid blue) leading to the blue component, where the system waits for another activation and where the mRNA/protein populations decrease. The expected time spent in states on the blue solid transitions is small, and thus we can reliably predict the bimodal distribution of the mRNA/P populations and its correlation with the DNA state. The refined abstraction also reveals that the switching time from the DNAon mode to the DNAoff mode is lower. These predictions are in accordance with the results obtained in [24]; see Fig. 8 [11, Appendix], adopted from [24], which illustrates these results.

To further test the accuracy of our approach, we consider fast switching between the DNA states. We follow the study in [24] and increase the rates by two orders of magnitude. We use the refined population discretisation and obtain a very similar abstraction as in Fig. 3 (right). We again obtain the oscillation between the green component (DNAon states and nonzero RNA/protein populations) and the blue state (DNAoff and zero RNA/protein populations). The only difference is in the transition rates corresponding to the activation and deactivation, which make the switching between the components much faster. As a consequence, the system spends a longer period in the blue transient states with Doff and nonzero RNA/protein populations. The time spent in these states decreases the correlation between the DNA state and the RNA/protein populations as well as the bimodality in the population distribution. This is again in accordance with [24].

To conclude this case study, we observe very good agreement between the results obtained using our approach and the results in [24] obtained via advanced and time-consuming numerical methods. We would like to emphasise that our abstraction and its solution are obtained within a fraction of a second, while the numerical methods have to approximate solutions of equations describing high-order conditional moments of the population distributions. As [24] does not report the runtime of the analysis and the implementation of their methods is not publicly available, we cannot directly compare the time complexity.

#### **5.2 Goutsias's Model**

Goutsias's model, illustrated in Table 2, is widely used for the evaluation of various numerical and simulation-based techniques. As shown e.g. in [23], with high probability the system has the following transient behaviour. In the first phase, the system switches with a high rate between the non-active DNA (denoted DNA) and the active DNA (DNA.D). During this phase the populations of RNA, monomers (M) and dimers (D) gradually increase (with only negligible oscillations). After around 15 min, the DNA is blocked (DNA.2D) and the population of RNA decreases while the populations of M and D are relatively stable. After all RNA degrades (around another 15 min) the system switches to the third phase, where the populations of M and D slowly decrease. Further, there is a non-negligible probability that the DNA is blocked at the beginning, while the population of RNA is still small, and the system promptly dies out.

**Table 2.** Goutsias's model. The rates are in s<sup>−1</sup>.

**Fig. 4.** Pruned abstraction for Goutsias's model. The state vector is [M + D, RNA, DNA, DNA.D, DNA.2D].

Although the system is quite suitable for hybrid approaches (there is no strong bimodality and only limited stiffness), the analysis still takes 10 to 50 min depending on the required precision [27]. We demonstrate that our approach is able to accurately predict the main transient behaviour as well as the non-negligible probability that the system promptly dies out.

The state vector is given as [M, D, RNA, DNA, DNA.D, DNA.2D] and the initial state is set to [2, 6, 0, 1, 0, 0] as in [27]. We start our analysis with a coarse population discretisation with a single threshold 100 for M and D and a single threshold 10 for RNA. We use relaxed bounds, in particular 1000 for M and D, and 100 for RNA. Note that these numbers were selected solely based on the rate orders of the relevant reactions. Note also the DNA invariant DNA + DNA.D + DNA.2D = 1.

Figure 4 illustrates the pruned abstract model we obtained (the full model is depicted in Fig. 9 [11, Appendix]). For better visualisation, we merged the state components corresponding to M and D into one component with M + D. As there is fast reversible dimerisation, the actual distribution between the populations of M and D does not affect the transient behaviour we are interested in.

The analysis of the model shows the following transient behaviour. The purple dotted loop in iteration i1 represents (de-)activation of the DNA. The expected exit time of this loop is 100 s. According to our abstraction, there are two options (with the same probability) to exit the loop: (1) the path a represents the DNA blocking followed by the quick extinction, and (2) the path b corresponds to the production of RNA and is followed by the red loop in iteration i2 that again represents (de-)activation of the DNA. Note that according to our abstraction, this loop contains states with populations of M/D as well as RNA up to 100 and 10, respectively.

The expected exit time of this loop is again 100 s and there are two options how to leave it: (1) the path within iteration i3 (taken with roughly 90% probability) represents again the DNA blocking and is followed by the extinction of RNA and consequently by the extinction of M/D in about 1000 s, and (2) the path within iteration 5 (shown in the full graph in Fig. 9 [11, Appendix]), taken with roughly 10% probability, represents a series of protein productions and leads to states with a high number of proteins (above 100 in our population discretisation). Afterwards, there is again a series of DNA (de-)activations followed by the DNA blocking and the extinction of RNA. As before, this leads to the extinction of M/D in about 1000 s.

Although this abstraction already shows the transient behaviour leading to the extinction in about 30 min, it introduces the following inaccuracies with respect to the known behaviour: (1) the probability of the fast extinction is higher, and (2) we do not observe the clear bell-shape pattern on the RNA (i.e. level 2 for the RNA is not reached in the abstraction). As in the previous case study, the problem is that the population discretisation is too coarse. As a result, the total rate of the DNA blocking (affected by the M/D population via the mass-action kinetics) is too high in the states with M/D population level 1. This can be directly seen in the interval CTMC representation, where the rate spans many orders of magnitude, incurring too much imprecision. Refining the M/D population discretisation eliminates the first inaccuracy. To obtain the clear bell-shape pattern on RNA, one has to refine also the RNA population discretisation.
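To see why a coarse level blows up the interval rates, consider a mass-action propensity c · #D over one abstract level (a sketch of ours; the rate constant below is an arbitrary placeholder, not a value from Table 2):

```python
import math

RATE_CONSTANT = 2e-3            # placeholder value, not taken from the model

def propensity(population):
    """Mass-action propensity of a reaction consuming one D molecule."""
    return RATE_CONSTANT * population

# Over the coarse level covering populations 1..99, the interval CTMC must
# label the blocking transition with the whole range of propensities:
lo, hi = propensity(1), propensity(99)
span = round(math.log10(hi / lo))   # the rate spans about 2 orders of magnitude
```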

#### **5.3 Viral Infection**

The viral infection model described in Table 3 represents the most challenging system we consider. It is highly stochastic and extremely stiff, with all species exhibiting high variance and some also very high molecular populations. Moreover, there is a bimodal distribution on the RNA population. As a consequence, the solution of the full CME, even using advanced reduction and aggregation techniques, is prohibitive due to state-space explosion, and stochastic simulations are very time consuming. State-of-the-art hybrid approaches integrating the LNA and an adaptive population partitioning [9] can handle this system but also need a very long execution time. For example, a transient analysis up to time t = 50 requires around 20 min, and up to t = 200 more than an hour.

To evaluate the accuracy of our approach on this challenging model, we also focus on the same transient analysis, namely the distribution of RNA at time t = 200. The analysis in [9] predicts a bimodal distribution where the probability that RNA is zero is around 20%, the remaining probability mass has a Gaussian distribution with mean around 17, and the probability that there are more than 30 RNAs is close to zero. This is confirmed by the simulation-based analysis in [23], which also shows the gradual growth of the RNA population. The simulation-based analysis in [43], however, estimates a lower probability (around 3%) that RNA is 0 and a higher mean of the remaining Gaussian distribution (around 23). Recall that obtaining accurate results using simulations is extremely time consuming due to very stiff reactions (a single simulation for t = 200 takes around 20 s).

**Table 3.** Viral infection. The rates are in day<sup>−1</sup>.

**Fig. 5.** Pruned abstraction for the viral infection model. The state vector is [P, RNA, DNA].

In the final experiments, we analyse the distribution of RNA at time t = 200 using our approach. The state vector is given as [P, RNA, DNA] and we start with the concrete state [0, 1, 0]. To sufficiently reason about the RNA population and to handle the very high population of the proteins, we use the following population discretisation: thresholds {10, 1000} for P, {10, 30} for RNA, and {10, 100} for DNA. As before, we use very relaxed bounds: 10000, 100, and 1000 for P, RNA, and DNA, respectively. Note that we ignore the population of the virus V as it does not affect the dynamics of the other species. This simplification makes the visualisation of our approach more readable and has no effect on the complexity of the analysis.
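One plausible reading of this discretisation, treating a zero population as its own level (our own helper, consistent with the level numbers used in the discussion below):

```python
import bisect

THRESHOLDS = {"P": [10, 1000], "RNA": [10, 30], "DNA": [10, 100]}

def level(population, thresholds):
    """0 for population 0; then one level per threshold interval, so for RNA
    the levels are 0, 1 (populations 1..10), 2 (11..30) and 3 (above 30)."""
    if population == 0:
        return 0
    return 1 + bisect.bisect_left(thresholds, population)

def abstract_state(P, RNA, DNA):
    return [level(P, THRESHOLDS["P"]),
            level(RNA, THRESHOLDS["RNA"]),
            level(DNA, THRESHOLDS["DNA"])]

# the concrete initial state [0, 1, 0] maps to abstract levels [0, 1, 0]
initial = abstract_state(0, 1, 0)
```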

Figure 5 illustrates the obtained abstract model, enabling the following transient analysis (the full model is depicted in Fig. 10 [11, Appendix]). In a few days the system reaches from the initial state the loop (depicted by the purple dashed ellipse) within iteration *i1*. The loop includes states where RNA has level 1, DNA has level 2 and P oscillates between levels 2 and 3. Before entering the loop, there is a non-negligible probability (on the order of percent) that the RNA drops to 0 via the full black branch, which returns to the transient part of the loop in *i1*. In this branch the system can also die out (not shown in this figure, see the full model) with probability in the order of tenths of a percent.

The average exit time of the loop in *i1* is in the order of 10 days, after which the system moves to the yellow loop within iteration *i2*, where the DNA level is increased to 3 (the RNA level is unchanged and P again oscillates between levels 2 and 3). The average exit time of the loop in *i2* is again in the order of 10 days, and the system moves to the dotted red loop within iteration *i3*. The transition represents a sequence of RNA syntheses that leads to RNA level 2; P oscillates as before. Finally, the system leaves the loop in *i3* (this takes another dozen days) and reaches RNA level 3 in iterations *i4* and *i5*, where the DNA level remains at 3 and P oscillates. Iterations *i4* and *i5* thus roughly correspond to the examined transient time t = 200.

The analysis clearly demonstrates that our approach yields behaviour that is well aligned with the previous experiments: we observe growth of the RNA population with a non-negligible probability of its extinction. The concrete quantities (i.e. the probability of the extinction and the mean RNA population) are closer to the analysis in [43]. The quantities are indeed affected by the population discretisation and can be further refined. We would like to emphasise that, in contrast to the methods presented in [9,23,43] requiring hours of intensive numerical computation, our approach can be carried out even manually on paper.

#### **References**



### **PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games**

Pranav Ashok, Jan Křetínský, and Maximilian Weininger(B)

Technical University of Munich, Munich, Germany maxi.weininger@tum.de

**Abstract.** Statistical model checking (SMC) is a technique for analysis of probabilistic systems that may be (partially) unknown. We present an SMC algorithm for (unbounded) reachability yielding probably approximately correct (PAC) guarantees on the results. We consider both the setting (i) with no knowledge of the transition function (the only quantity required being a bound on the minimum transition probability) and (ii) with knowledge of the topology of the underlying graph. On the one hand, it is the first algorithm for stochastic games. On the other hand, it is the first practical algorithm even for Markov decision processes. Compared to previous approaches, where PAC guarantees require running times longer than the age of the universe even for systems with a handful of states, our algorithm often yields reasonably precise results within minutes, without requiring knowledge of the mixing time.

#### **1 Introduction**

**Statistical model checking (SMC)** [YS02a] is an analysis technique for probabilistic systems based on simulating finitely many runs of the system and evaluating them statistically.


One of the advantages is that it can avoid the state-space explosion problem, albeit at the cost of weaker guarantees. Even more importantly, this technique is applicable even when the model is not known (*black-box* setting) or only

This research was funded in part by TUM IGSSE Grant 10.06 (PARSEC), the Czech Science Foundation grant No. 18-11193S, and the German Research Foundation (DFG) project KR 4890/2-1 "Statistical Unbounded Verification".

qualitatively known (*grey-box* setting), where the exact transition probabilities are unknown such as in many cyber-physical systems.

In the basic setting of Markov chains [Nor98] with (time- or step-)bounded properties, the technique is very efficient and has been applied to numerous domains, e.g. biological [JCL+09,PGL+13], hybrid [ZPC10,DDL+12,EGF12,Lar12] or cyber-physical [BBB+10,CZ11,DDL+13] systems, and substantial tool support is available [JLS12,BDL+12,BCLS13,BHH12]. In contrast, whenever either (i) infinite time-horizon properties, e.g. reachability, are considered or (ii) non-determinism is present in the system, providing any guarantees becomes significantly harder.

Firstly, for *infinite time-horizon properties* we need a stopping criterion such that the infinite-horizon property can be reliably evaluated based on a finite prefix of the run yielded by simulation. This can rely on the complete knowledge of the system (*white-box* setting) [YCZ10,LP08], the topology of the system (grey box) [YCZ10,HJB+10], or a lower bound pmin on the minimum transition probability in the system (black box) [DHKP16,BCC+14].

Secondly, for Markov decision processes (MDP) [Put14] with (non-trivial) *non-determinism*, [HMZ+12] and [LP12] employ reinforcement learning [SB98] in the setting of bounded properties or discounted (and for the purposes of approximation thus also bounded) properties, respectively. The latter also yields PAC guarantees.

Finally, for MDP with unbounded properties, [BFHH11] deals with MDP with spurious non-determinism, where the way it is resolved does not affect the desired property. The general non-deterministic case is treated in [FT14,BCC+14], yielding PAC guarantees. However, the former requires knowledge of the mixing time, which is at least as hard to compute; the algorithm in the latter is purely theoretical, since before a single value is updated in the learning process, one has to simulate longer than the age of the universe even for a system as simple as a Markov chain with 12 states having at least 4 successors for some state.

**Our contribution** is an SMC algorithm with PAC guarantees for (i) MDP and unbounded properties, which runs, for realistic benchmarks [HKP+19] and confidence intervals, in the order of minutes, and (ii) is the first algorithm for stochastic games (SG). It relies on different techniques from the literature.

	- extending early detection of bottom strongly connected components in Markov chains by [DHKP16] to end components for MDP and simple end components for SG;
	- improving the underlying PAC Q-learning technique of [SLW+06]:
		- (a) learning is now model-based with better information reuse instead of model-free, but in realistic settings with the same memory requirements;
		- (b) better guidance of learning due to interleaving with precise computation, which yields more precise value estimates;
		- (c) splitting confidence over all relevant transitions, allowing for variable width of confidence intervals on the learnt transition probabilities.

2. The transition from algorithms for MDP to SG is possible via extending the over-approximating value iteration from MDP [BCC+14] to SG by [KKKW18].

To summarize, we give an anytime PAC SMC algorithm for (unbounded) reachability. It is the first such algorithm for SG and the first practical one for MDP.

#### **Related Work**

Most of the previous efforts in SMC have focused on the analysis of properties with *bounded* horizon [YS02a,SVA04,YKNP06,JCL+09,JLS12,BDL+12].

SMC of *unbounded* properties was first considered in [HLMP04] and the first approach was proposed in [SVA05], but it was shown to be incorrect in [HJB+10]. Notably, in [YCZ10] two approaches are described. The first approach proposes to terminate sampled paths at every step with some probability pterm and reweight the result accordingly. In order to guarantee the asymptotic convergence of this method, the second eigenvalue λ of the chain and its mixing time must be computed, which is as hard as the verification problem itself and requires the complete knowledge of the system (white box setting). The correctness of [LP08] relies on the knowledge of the second eigenvalue λ, too. The second approach of [YCZ10] requires the knowledge of the chain's topology (grey box), which is used to transform the chain so that all potentially infinite paths are eliminated. In [HJB+10], a similar transformation is performed, again requiring knowledge of the topology. In [DHKP16], only (a lower bound on) the minimum transition probability pmin is assumed and PAC guarantees are derived. While unbounded properties cannot be analyzed without any information on the system, knowledge of pmin is a relatively light assumption in many realistic scenarios [DHKP16]. For instance, bounds on the rates for reaction kinetics in chemical reaction systems are typically known; for models in the PRISM language [KNP11], the bounds can be easily inferred without constructing the respective state space. In this paper, we thus adopt this assumption.

In the case with general *non-determinism*, one approach is to give the non-determinism a probabilistic semantics, e.g., using a uniform distribution instead, as for timed automata in [DLL+11a,DLL+11b,Lar13]. Others [LP12,HMZ+12,BCC+14] aim to quantify over all strategies and produce an ε-optimal strategy. In [HMZ+12], candidates for optimal strategies are generated and gradually improved, but "at any given point we cannot quantify how close to optimal the candidate scheduler is" (cited from [HMZ+12]) and the algorithm "does not in general converge to the true optimum" (cited from [LST14]). Further, [LST14,DLST15,DHS18] randomly sample compact representations of strategies, resulting in useful lower bounds if ε-schedulers are frequent. [HPS+19] gives a convergent model-free algorithm (with no bounds on the current error) and identifies that the previous [SKC+14] "has two faults, the second of which also affects approaches [...] [HAK18,HAK19]".

Several approaches provide SMC for MDPs and unbounded properties with *PAC guarantees*. Firstly, similarly to [LP08,YCZ10], [FT14] requires (1) the mixing time T of the MDP. The algorithm then yields PAC bounds in time polynomial in T (which in turn can of course be exponential in the size of the MDP). Moreover, the algorithm requires (2) the ability to restart simulations also in non-initial states, (3) it only returns the strategy once all states have been visited (sufficiently many times), and thus (4) requires the size of the state space |S|. Secondly, [BCC+14], based on delayed Q-learning (DQL) [SLW+06], lifts the assumptions (2) and (3) and instead of (1) mixing time requires only (a bound on) the minimum transition probability pmin. Our approach additionally lifts the assumption (4) and allows for running times faster than those given by T, even without the knowledge of T.

Reinforcement learning (without PAC bounds) for stochastic games has been considered already in [LN81,Lit94,BT99]. [WT16] combines the special case of almost-sure satisfaction of a specification with optimizing quantitative objectives. We use techniques of [KKKW18], which however assumes access to the transition probabilities.

#### **2 Preliminaries**

#### **2.1 Stochastic Games**

A *probability distribution* on a finite set X is a mapping δ : X → [0, 1] such that ∑<sub>x∈X</sub> δ(x) = 1. The set of all probability distributions on X is denoted by D(X). Now we define turn-based two-player stochastic games. As opposed to the notation of e.g. [Con92], we do not have special stochastic nodes, but rather a probabilistic transition function.

**Definition 1 (SG).** A *stochastic game (SG)* is a tuple G = (S, S<sub>□</sub>, S<sub>○</sub>, s<sub>0</sub>, A, Av, T), where S is a finite set of *states* partitioned<sup>1</sup> into the sets S<sub>□</sub> and S<sub>○</sub> of states of the player *Maximizer* and *Minimizer*,<sup>2</sup> respectively, s<sub>0</sub> ∈ S is the *initial* state, A is a finite set of *actions*, Av : S → 2<sup>A</sup> assigns to every state a set of *available* actions, and T : S × A → D(S) is a *transition function* that, given a state s and an action a ∈ Av(s), yields a probability distribution over *successor* states. Note that for ease of notation we write T(s, a, t) instead of T(s, a)(t).

A Markov decision process (MDP) is a special case of SG where S<sub>○</sub> = ∅. A Markov chain (MC) can be seen as a special case of an MDP, where for all s ∈ S : |Av(s)| = 1. We assume that SG are non-blocking, so for all states s we have Av(s) ≠ ∅.

For a state s and an available action a ∈ Av(s), we denote the set of successors by Post(s, a) := {t | T(s, a, t) > 0}. We say a state-action pair (s, a) is an *exit* of a set of states T, written (s, a) exits T, if ∃t ∈ Post(s, a) : t ∉ T, i.e., if with some probability a successor outside of T could be chosen.

We consider algorithms that have a limited information about the SG.

<sup>1</sup> I.e., S<sub>□</sub> ⊆ S, S<sub>○</sub> ⊆ S, S<sub>□</sub> ∪ S<sub>○</sub> = S, and S<sub>□</sub> ∩ S<sub>○</sub> = ∅.

<sup>2</sup> The names are chosen because Maximizer maximizes the probability of reaching a given target state, and Minimizer minimizes it.

**Definition 2 (Black box and grey box).** *An algorithm inputs an SG as* black box *if it cannot access the whole tuple, but*

- *it knows the initial state,*
- *for a given state, an oracle returns its player and available actions,*
- *given a state* s *and an action* a ∈ Av(s)*, it can sample a successor* t *according to* T(s, a)*,*<sup>3</sup>
- *it knows* pmin ≤ min{T(s, a, t) | s ∈ S, a ∈ Av(s), t ∈ Post(s, a)}*, an under-approximation of the minimum transition probability.*

*When input as* grey box*, it additionally knows the number* |Post(s, a)| *of successors for each state* s *and action* a*.*<sup>4</sup>

The semantics of SG is given in the usual way by means of strategies and the induced Markov chain [BK08] and its respective probability space, as follows. An *infinite path* ρ is an infinite sequence ρ = s<sub>0</sub>a<sub>0</sub>s<sub>1</sub>a<sub>1</sub> · · · ∈ (S × A)<sup>ω</sup>, such that for every i ∈ ℕ, a<sub>i</sub> ∈ Av(s<sub>i</sub>) and s<sub>i+1</sub> ∈ Post(s<sub>i</sub>, a<sub>i</sub>).

A *strategy* of Maximizer or Minimizer is a function σ : S<sub>□</sub> → D(A) or σ : S<sub>○</sub> → D(A), respectively, such that σ(s) ∈ D(Av(s)) for all s. Note that we restrict to memoryless/positional strategies, as they suffice for reachability in SGs [CH12].

A pair (σ, τ) of strategies of Maximizer and Minimizer induces a Markov chain G<sup>σ,τ</sup> with states S, s<sub>0</sub> being initial, and the transition function T(s)(t) = ∑<sub>a∈Av(s)</sub> σ(s)(a) · T(s, a, t) for states of Maximizer, and analogously for states of Minimizer, with σ replaced by τ. The Markov chain induces a unique probability distribution P<sup>σ,τ</sup> over measurable sets of infinite paths [BK08, Ch. 10].

#### **2.2 Reachability Objective**

For a goal set Goal ⊆ S, we write ♦Goal := {s<sub>0</sub>a<sub>0</sub>s<sub>1</sub>a<sub>1</sub> · · · | ∃i ∈ ℕ : s<sub>i</sub> ∈ Goal} to denote the (measurable) set of all infinite paths which eventually reach Goal. For each s ∈ S, we define the *value* in s as

$$\mathcal{V}(\mathbf{s}) := \sup\_{\sigma} \inf\_{\tau} \mathbb{P}\_s^{\sigma, \tau}(\diamondsuit \mathbf{Goal}) = \inf\_{\tau} \sup\_{\sigma} \mathbb{P}\_s^{\sigma, \tau}(\diamondsuit \mathbf{Goal}),$$

where the equality follows from [Mar75]. We are interested in V(s0), its ε-approximation and the corresponding (ε-)optimal strategies for both players.

<sup>3</sup> Up to this point, this definition conforms to black box systems in the sense of [SVA04] with sampling from the initial state, being slightly stricter than [YS02a] or [RP09], where simulations can be run from any desired state. Further, we assume that we can choose actions for the adversarial player or that she plays fairly. Otherwise the adversary could avoid playing her best strategy during the SMC, not giving SMC enough information about her possible behaviours.

<sup>4</sup> This requirement is slightly weaker than the knowledge of the whole topology, i.e. Post(s, a) for each s and a.

Let Zero be the set of states, from which there is no finite path to any state in Goal. The value function V satisfies the following system of equations, which is referred to as the *Bellman equations*:

$$\mathsf{V}(\mathsf{s}) = \begin{cases} \max\_{\mathsf{a} \in \mathsf{Av}(\mathsf{s})} \mathsf{V}(\mathsf{s}, \mathsf{a}) & \text{if } \mathsf{s} \in \mathsf{S}\_{\square} \\ \min\_{\mathsf{a} \in \mathsf{Av}(\mathsf{s})} \mathsf{V}(\mathsf{s}, \mathsf{a}) & \text{if } \mathsf{s} \in \mathsf{S}\_{\bigcirc} \\ 1 & \text{if } \mathsf{s} \in \mathsf{Goal} \\ 0 & \text{if } \mathsf{s} \in \mathsf{Zero} \end{cases}$$

with the abbreviation V(s, a) := ∑<sub>s′∈S</sub> T(s, a, s′) · V(s′). Moreover, V is the *least* solution to the Bellman equations, see e.g. [CH08].
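To make the fixpoint computation concrete, here is a minimal value-iteration sketch in Python. The encoding of the game (the dicts `player`, `av`, `T`) and all state and action names are illustrative assumptions, not the paper's pseudocode; a plain iteration count stands in for the convergence criterion whose difficulty motivates bounded value iteration.

```python
# Minimal sketch: solve the Bellman equations by iterating from the
# under-approximation (1 on Goal, 0 elsewhere). All names are illustrative.

def value_iteration(states, player, av, T, goal, zero, iters=1000):
    V = {s: (1.0 if s in goal else 0.0) for s in states}
    for _ in range(iters):
        for s in states:
            if s in goal or s in zero:
                continue  # fixed to 1 resp. 0 by the Bellman equations
            vals = [sum(p * V[t] for t, p in T[(s, a)].items()) for a in av[s]]
            V[s] = max(vals) if player[s] == "max" else min(vals)
    return V

# Toy SG: Maximizer state s0 chooses between a fair coin and a Minimizer
# state m, from which Minimizer can escape to the sink.
states = ["s0", "m", "goal", "sink"]
player = {"s0": "max", "m": "min", "goal": "max", "sink": "max"}
av = {"s0": ["coin", "to_m"], "m": ["to_goal", "to_sink"]}
T = {
    ("s0", "coin"): {"goal": 0.5, "sink": 0.5},
    ("s0", "to_m"): {"m": 1.0},
    ("m", "to_goal"): {"goal": 1.0},
    ("m", "to_sink"): {"sink": 1.0},
}
V = value_iteration(states, player, av, T, {"goal"}, {"sink"})
print(V["s0"])  # 0.5: Minimizer avoids the goal, so the coin is preferable
```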

#### **2.3 Bounded and Asynchronous Value Iteration**

The well-known technique of value iteration, e.g. [Put14,RF91], works by starting from an under-approximation of the value function and then applying the Bellman equations. This converges towards the least fixpoint of the Bellman equations, i.e. the *value function*. Since it is difficult to give a convergence criterion, the approach of bounded value iteration (BVI, also called interval iteration) was developed for MDP [BCC+14,HM17] and SG [KKKW18]. Besides the under-approximation, it also updates an over-approximation according to the Bellman equations. The most conservative over-approximation is to use an upper bound of 1 for every state. For the under-approximation, we can set the lower bound of target states to 1; all other states have a lower bound of 0. We use the function INITIALIZE BOUNDS in our algorithms to denote that the lower and upper bounds are set as just described; see [AKW19, Algorithm 8] for the pseudocode. Additionally, BVI ensures that the over-approximation converges to the least fixpoint by taking special care of *end components*, which are the reason for not converging to the true value from above.

**Definition 3 (End component (EC)).** *A non-empty set* T ⊆ S *of states is an* end component (EC) *if there is a non-empty set* B ⊆ ⋃<sub>s∈T</sub> Av(s) *of actions such that (i) for each* s ∈ T, a ∈ B ∩ Av(s) *we do* not *have* (s, a) exits T*, and (ii) for each* s, s′ ∈ T *there is a finite path* w = s a<sub>0</sub> s<sub>1</sub> a<sub>1</sub> . . . a<sub>n</sub> s′ ∈ (T × B)<sup>∗</sup> × T*, i.e. the path stays inside* T *and only uses actions in* B*.*

Intuitively, ECs correspond to bottom strongly connected components of the Markov chains induced by possible strategies, so for some pair of strategies all possible paths starting in the EC remain there. An end component T is a *maximal end component (MEC)* if there is no other end component T′ such that T ⊊ T′. Given an SG G, the set of its MECs is denoted by MEC(G).

Note that, to stay in an EC in an SG, the two players would have to cooperate, since it depends on the pair of strategies. To take into account the adversarial behaviour of the players, it is also relevant to look at a subclass of ECs, the so called *simple end components*, introduced in [KKKW18].

**Definition 4 (Simple end component (SEC)** [KKKW18]**).** *An EC* T *is called* simple*, if for all* s ∈ T *it holds that* V(s) = bestExit(T,V)*, where*

$$\mathsf{bestExit}(T, f) := \begin{cases} 1 & \text{if } T \cap \mathsf{Goal} \neq \emptyset \\ \max\_{\substack{\mathsf{s} \in T \cap \mathsf{S}\_{\square},\ (\mathsf{s},\mathsf{a})\ \mathsf{exits}\ T}} f(\mathsf{s}, \mathsf{a}) & \text{else} \end{cases}$$

*is called the* best exit *(of Maximizer) from* T *according to the function* f : S → ℝ*. To handle the case that there is no exit of Maximizer in* T*, we set* max<sub>∅</sub> := 0*.*

Intuitively, SECs are ECs where Minimizer does not want to use any of her exits, as all of them have a greater value than the best exit of Maximizer. Assigning any value between those of the best exits of Maximizer and Minimizer to all states in the EC is a solution to the Bellman equations, because both players prefer remaining and getting that value to using their exits [KKKW18, Lemma 1]. However, this is suboptimal for Maximizer, as the goal is not reached if the game remains in the EC forever. Hence we "deflate" the upper bounds of SECs, i.e. reduce them to depend on the best exit of Maximizer. T is called a maximal simple end component (MSEC), if there is no SEC T′ such that T ⊊ T′. Note that in MDPs, treating all MSECs amounts to treating all MECs.


Algorithm 1 rephrases that of [KKKW18] and describes the general structure of all bounded value iteration algorithms that are relevant for this paper. We discuss it here since all our improvements refer to functions (in capitalized font) in it. In the next section, we design new functions, pinpointing the difference to the other papers. The pseudocode of the functions adapted from the other papers can be found, for the reader's convenience, in [AKW19, Appendix A]. Note that to improve readability, we omit the parameters G,Goal, L and U of the functions in the algorithm.

**Bounded Value Iteration:** For the standard bounded value iteration algorithm, Line 4 does not run a simulation, but just assigns the whole state space S to X<sup>5</sup>. Then it updates all values according to the Bellman equations.

<sup>5</sup> Since we mainly talk about simulation based algorithms, we included this line to make their structure clearer.

After that it finds all the problematic components, the MSECs, and "deflates" them as described in [KKKW18], i.e. it reduces their values to ensure the convergence to the least fixpoint. This suffices for the bounds to converge and the algorithm to terminate [KKKW18, Theorem 2].

**Asynchronous Bounded Value Iteration:** To tackle the state space explosion problem, *asynchronous* simulation/learning-based algorithms have been developed [MLG05,BCC+14,KKKW18]. The idea is not to update and deflate all states at once, since there might be too many, or since we only have limited information. Instead of considering the whole state space, a path through the SG is sampled by picking in every state one of the actions that look optimal according to the current over-/under-approximation and then sampling a successor of that action. This is repeated until either a target is found, or until the simulation is looping in an EC; the latter case occurs if the heuristic that picks the actions generates a pair of strategies under which both players only pick staying actions in an EC. After the simulation, only the bounds of the states on the path are updated and deflated. Since we pick actions which look optimal in the simulation, we almost surely find an ε-optimal strategy and the algorithm terminates [BCC+14, Theorem 3].
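A single simulation of this kind can be sketched as follows. The data structures and names are illustrative assumptions; for brevity only Maximizer states with an upper-bound-greedy action choice are shown, while a full implementation would treat Minimizer dually and use a proper EC check instead of the simple revisit test.

```python
import random

# Sketch of one simulation of asynchronous BVI: follow an action maximizing
# the current upper bound U, sample a successor, and stop on a terminal state
# or when a state repeats (a candidate for being stuck in an EC).

def simulate(av, T, U, s0, terminal, rng):
    path, seen = [], set()
    s = s0
    while s not in terminal and s not in seen:
        seen.add(s)
        path.append(s)
        # greedy w.r.t. the current upper bounds (Maximizer's view)
        a = max(av[s], key=lambda a: sum(p * U[t] for t, p in T[(s, a)].items()))
        ts, ps = zip(*T[(s, a)].items())
        s = rng.choices(ts, weights=ps)[0]
    path.append(s)
    return path

av = {"s0": ["a"]}
T = {("s0", "a"): {"goal": 0.5, "sink": 0.5}}
U = {"s0": 1.0, "goal": 1.0, "sink": 1.0}
path = simulate(av, T, U, "s0", {"goal", "sink"}, random.Random(42))
print(path)  # starts in "s0" and ends in "goal" or "sink"
```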

#### **3 Algorithm**

#### **3.1 Model-Based**

Given only limited information, updating cannot be done using T, since the true probabilities are not known. The approach of [BCC+14] is to sample for a high number of steps and accumulate the observed lower and upper bounds on the true value function for each state-action pair. When the number of samples is large enough, the average of the accumulator is used as the new estimate for the state-action pair, and thus the approximations can be improved and the results back-propagated, while giving statistical guarantees that each update was correct. However, this approach has several drawbacks, the biggest of which is that the number of steps before an update can occur is infeasibly large, often larger than the age of the universe, see Table 1 in Sect. 4.

Our improvements to make the algorithm practically usable are linked to constructing a partial model of the given system. That way, we have more information available on which we can base our estimates, and we can be less conservative when giving bounds on the possible errors. The shift from model-free to model-based learning asymptotically increases the memory requirements from O(|S| · |A|) (as in [SLW+06,BCC+14]) to O(|S|<sup>2</sup> · |A|). However, for systems where each action has a small constant bound on the number of successors, which is typical for many practical systems, e.g. classical PRISM benchmarks, it is still O(|S| · |A|) with a negligible constant difference.

We thus track the number of times some successor t has been observed when playing action a from state s in a variable #(s, a, t). This implicitly induces the number of times each state-action pair (s, a) has been played, #(s, a) = ∑<sub>t∈S</sub> #(s, a, t). Given these numbers we can then calculate probability estimates for every transition as described in the next subsection. They also induce the set of all states visited so far, allowing us to construct a partial model of the game. See [AKW19, Appendix A.2] for the pseudo-code of how to count the occurrences during the simulations.
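The counters can be kept in a simple map, sketched here with hypothetical names (the paper's own pseudocode is in [AKW19, Appendix A.2]):

```python
from collections import Counter

# Sketch of the occurrence counters kept across simulations: #(s, a, t) counts
# how often playing a in s led to t; #(s, a) is obtained by summing over t.

counts = Counter()  # keys: (s, a, t) triples

def record(s, a, t):
    counts[(s, a, t)] += 1

def count_sa(s, a):
    return sum(n for (s2, a2, _), n in counts.items() if (s2, a2) == (s, a))

for succ in ["0", "0", "0", "0", "1"]:  # five observed samples of (s1, b2)
    record("s1", "b2", succ)
print(count_sa("s1", "b2"), counts[("s1", "b2", "1")])  # 5 1
```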

#### **3.2 Safe Updates with Confidence Intervals Using Distributed Error Probability**

We use the counters to compute a lower estimate of the transition probability for some error tolerance δ<sub>T</sub> as follows: we view sampling t from state-action pair (s, a) as a Bernoulli sequence with success probability T(s, a, t), number of trials #(s, a) and number of successes #(s, a, t). The tightest lower estimate we can give using the Hoeffding bound (see [AKW19, Appendix D.1]) is

$$\widehat{\mathbb{T}}(\mathbf{s}, \mathbf{a}, \mathbf{t}) := \max(0, \frac{\#(s, a, t)}{\#(s, a)} - c), \tag{1}$$

where the confidence width c := √(ln(δ<sub>T</sub>)/(−2#(s, a))). Since c could be greater than 1, we limit the lower estimate to be at least 0. Now we can give modified update equations:

$$\begin{aligned} \widehat{\mathsf{L}}(\mathsf{s},\mathsf{a}) &:= \sum\_{\mathsf{t}: \#(\mathsf{s},\mathsf{a},\mathsf{t}) > 0} \widehat{\mathsf{T}}(\mathsf{s},\mathsf{a},\mathsf{t}) \cdot \mathsf{L}(\mathsf{t}) \\ \widehat{\mathsf{U}}(\mathsf{s},\mathsf{a}) &:= \left( \sum\_{\mathsf{t}: \#(\mathsf{s},\mathsf{a},\mathsf{t}) > 0} \widehat{\mathsf{T}}(\mathsf{s},\mathsf{a},\mathsf{t}) \cdot \mathsf{U}(\mathsf{t}) \right) + \left( 1 - \sum\_{\mathsf{t}: \#(\mathsf{s},\mathsf{a},\mathsf{t}) > 0} \widehat{\mathsf{T}}(\mathsf{s},\mathsf{a},\mathsf{t}) \right). \end{aligned}$$

The idea is the same for both the upper and the lower bound: in contrast to the usual Bellman equation (see Sect. 2.2) we use the lower estimates T̂ instead of the true T. But since the sum of all the lower estimates does not add up to one, there is some remaining probability for which we need to under-/over-approximate the value it can achieve. We use

**Fig. 1.** A running example of an SG. The dashed part is only relevant for the later examples. For actions with only one successor, we do not depict the transition probability 1 (e.g. <sup>T</sup>(s<sup>0</sup>, <sup>a</sup><sup>1</sup>,s1)). For state-action pair (s<sup>1</sup>, <sup>b</sup>2), the transition probabilities are parameterized and instantiated in the examples where they are used.

the safe approximations 0 and 1 for the lower and upper bound, respectively; this is why in L̂ there is no second term and in Û the whole remaining probability is added. Algorithm 2 shows the modified update that uses the lower estimates; the proof of its correctness is in [AKW19, Appendix D.2].
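The update for a single state-action pair can be sketched as follows; the Hoeffding estimate of Eq. (1) is recomputed inline, and the demo input uses the data of Example 1. All data structures and names are illustrative, not the paper's pseudocode:

```python
import math

# Sketch of UPDATE for one state-action pair: Bellman sums over the lower
# estimates T_hat, with the unassigned probability mass valued 0 in L_hat
# and 1 in U_hat.

def t_hat(n_sat, n_sa, delta_t):
    c = math.sqrt(math.log(delta_t) / (-2 * n_sa))  # confidence width
    return max(0.0, n_sat / n_sa - c)

def update_sa(succ_counts, L, U, delta_t):
    n_sa = sum(succ_counts.values())
    est = {t: t_hat(n, n_sa, delta_t) for t, n in succ_counts.items()}
    mass = sum(est.values())
    l_hat = sum(p * L[t] for t, p in est.items())               # rest -> 0
    u_hat = sum(p * U[t] for t, p in est.items()) + (1 - mass)  # rest -> 1
    return l_hat, u_hat

# Example 1 data: 5 samples of (s1, b2), one to target "1", four to sink "0",
# where the sink is already known to have value 0.
L = {"1": 1.0, "0": 0.0}
U = {"1": 1.0, "0": 0.0}
l, u = update_sa({"1": 1, "0": 4}, L, U, delta_t=0.1)
print(round(l, 2), round(u, 2))  # 0.0 0.68
```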

**Lemma 1 (**UPDATE **is correct).** *Given correct under- and over-approximations* L, U *of the value function* V*, and correct lower probability estimates* T̂*, the under- and over-approximations after an application of* UPDATE *are also correct.*


*Example 1.* We illustrate how the calculation works and its huge advantage over the approach from [BCC+14] on the SG from Fig. 1. For this example, ignore the dashed part and let p<sub>1</sub> = p<sub>2</sub> = 0.5, i.e. we have no self-loop, and an even chance to go to the target 1 or to the sink 0. Observe that hence V(s<sub>0</sub>) = V(s<sub>1</sub>) = 0.5.

Given an error tolerance of δ = 0.1, the algorithm of [BCC+14] would have to sample for more than 10<sup>9</sup> steps before it could attempt a single update. In contrast, assume we have seen 5 samples of action b<sub>2</sub>, where 1 of them went to 1 and 4 of them to 0. Note that, in a sense, we were unlucky here, as the observed averages are very different from the actual distribution. The confidence width for δ<sub>T</sub> = 0.1 and 5 samples is √(ln(0.1)/(−2 · 5)) ≈ 0.48. So given that data, we get T̂(s<sub>1</sub>, b<sub>2</sub>, 1) = max(0, 0.2 − 0.48) = 0 and T̂(s<sub>1</sub>, b<sub>2</sub>, 0) = max(0, 0.8 − 0.48) = 0.32. Note that both probabilities are in fact lower estimates for their true counterparts.

Assume we already found out that 0 is a sink with value 0; how we gain this knowledge is explained in the following subsections. Then, after getting only these 5 samples, UPDATE already decreases the upper bound of (s1, b2) to 0.68, as we know that at least 0.32 of T(s1, b2) goes to the sink.

Given 500 samples of action b2, the confidence width of the probability estimates already has decreased below 0.05. Then, since we have this confidence width for both the upper and the lower bound, we can decrease the total precision for (s1, b2) to 0.1, i.e. return an interval in the order of [0.45; 0.55]. 
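The shrinking of the confidence width with the number of samples can be checked directly; `confidence_width` is an illustrative helper, not a function from the paper:

```python
import math

# Confidence width c = sqrt(ln(delta_T) / (-2 n)) for delta_T = 0.1 and a
# growing number of samples n; numbers match the example above.

def confidence_width(n, delta_t=0.1):
    return math.sqrt(math.log(delta_t) / (-2 * n))

print(round(confidence_width(5), 2))   # 0.48
print(confidence_width(500) < 0.05)    # True
# With width < 0.05 on both bounds, the interval for (s1, b2) has size ~0.1.
```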

Summing up: with the model-based approach we can already start updating after very few steps and get a reasonable level of confidence with a realistic number of samples. In contrast, the state-of-the-art approach of [BCC+14] needs a very large number of samples even for this toy example.

Since for UPDATE we need an error tolerance for every transition, we need to distribute the given total error tolerance δ over all transitions in the current partial model. For all states in the explored partial model Ŝ we know the number of available actions, and we can over-approximate the number of successors as 1/pmin. Thus the error tolerance for each transition can be set to δ<sub>T</sub> := δ · pmin / |{a | s ∈ Ŝ ∧ a ∈ Av(s)}|. This is illustrated in Example 4 in [AKW19, Appendix B].

Note that the fact that the error tolerance δ<sub>T</sub> for every transition is the same does *not* imply that the confidence width for every transition is the same, as the latter becomes smaller with an increasing number of samples #(s, a).
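A sketch of this distribution of the error tolerance, with illustrative names and an illustrative partial model:

```python
# Distribute the total error tolerance delta over the transitions of the
# explored partial model: each state-action pair has at most 1/p_min
# successors, so delta_T = delta * p_min / (number of explored pairs).

def delta_per_transition(delta, p_min, av, explored):
    n_pairs = sum(len(av[s]) for s in explored)
    return delta * p_min / n_pairs

av = {"s0": ["a1"], "s1": ["b1", "b2"]}        # illustrative partial model
dt = delta_per_transition(0.1, 0.25, {"s0": av["s0"], "s1": av["s1"]}, {"s0", "s1"})
print(dt)  # 0.1 * 0.25 / 3 explored state-action pairs
```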

#### **3.3 Improved EC Detection**

As mentioned in the description of Algorithm 1, we must detect when the simulation is stuck in a bottom EC and looping forever. However, we may also stop simulations that are looping in some EC but still have a possibility to leave it; for a discussion of different heuristics from [BCC+14,KKKW18], see [AKW19, Appendix A.3].

We choose to define LOOPING as follows: given a candidate for a bottom EC, we continue sampling until we are δ<sub>T</sub>-*sure* (i.e. the error probability is smaller than δ<sub>T</sub>) that we cannot leave it. Then we can safely deflate the EC, i.e. decrease all upper bounds to zero.

To detect that something is a δ<sub>T</sub>-*sure* EC, we do not sample for the astronomical number of steps as in [BCC+14], but rather extend the approach to detect bottom strongly connected components from [DHKP16]. If in the EC-candidate T there was some state-action pair (s, a) that actually has a probability to exit T, that probability is at least pmin. So after sampling (s, a) n times, the probability to overlook such a leaving transition is (1 − pmin)<sup>n</sup>, and it should be smaller than δ<sub>T</sub>. Solving the inequality for the required number of samples n yields n ≥ ln(δ<sub>T</sub>)/ln(1 − pmin).
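The required number of samples is easy to compute; `required_samples` is a hypothetical helper name:

```python
import math

# Number of samples after which an exit of probability >= p_min would have
# been seen with probability >= 1 - delta_T: solve (1 - p_min)^n <= delta_T.

def required_samples(delta_t, p_min):
    return math.ceil(math.log(delta_t) / math.log(1 - p_min))

print(required_samples(0.1, 1 / 3))   # 6 (as in Example 2: 5.6 rounded up)
print(required_samples(0.1, 0.01))    # rare exits need far more samples
```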

Algorithm 3 checks that we have seen all staying state-action pairs n times, and hence that we are δ<sub>T</sub>-*sure* that T is an EC. Note that we restrict to staying state-action pairs, since the requirement for an EC is only that there exist staying actions, not that all actions stay. We further speed up the EC-detection, because we do not wait for n samples in every simulation, but we use the aggregated counters that are kept over all simulations.



We stop a simulation if LOOPING returns true, i.e. under the following three conditions: (i) we have seen the current state before in this simulation (s ∈ X), i.e. there is a cycle; (ii) this cycle is explainable by an EC T in our current partial model; (iii) we are δ<sub>T</sub>-*sure* that T is an EC.


*Example 2.* For this example, we again use the SG from Fig. 1 without the dashed part, but this time with p<sub>1</sub> = p<sub>2</sub> = p<sub>3</sub> = 1/3. Assume the path we simulated is (s<sub>0</sub>, a<sub>1</sub>, s<sub>1</sub>, b<sub>2</sub>, s<sub>1</sub>), i.e. we sampled the self-loop of action b<sub>2</sub>. Then {s<sub>1</sub>} is a candidate for an EC, because given our current observation it seems possible that we will continue looping there forever. However, we do not stop the simulation here, because we are not yet δ<sub>T</sub>-*sure* about this. Given δ<sub>T</sub> = 0.1, the required number of samples for that is 6, since ln(0.1)/ln(1 − 1/3) ≈ 5.6. With high probability (greater than (1 − δ<sub>T</sub>) = 0.9), within these 6 steps we will sample one of the other successors of (s<sub>1</sub>, b<sub>2</sub>) and thus realise that we should not stop the simulation in s<sub>1</sub>. If, on the other hand, we are in state 0, or if in state s<sub>1</sub> the guiding heuristic only picks b<sub>1</sub>, then we are in fact looping for more than 6 steps, and hence we stop the simulation.

#### **3.4 Adapting to Games: Deflating MSECs**

To extend the algorithm of [BCC+14] to SGs, instead of collapsing problematic ECs we deflate them as in [KKKW18], i.e. given an MSEC, we reduce the upper bound of all states in it to the upper bound of the bestExit of Maximizer. In contrast to [KKKW18], we cannot use the upper bound of the bestExit based on the true probability, but only based on our estimates. Algorithm 5 shows how to deflate an MSEC and highlights the difference, namely that we use Û instead of U.
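The deflate step can be sketched as follows, with estimated upper bounds for state-action pairs; the data structures and the concrete numbers are illustrative, not the paper's pseudocode:

```python
# Sketch of DEFLATE: cap the upper bound of every state in an MSEC at the
# estimated best exit of Maximizer.

def deflate(msec, exits, U_sa, maximizer_states, U):
    """exits: state-action pairs of msec states whose action leaves msec."""
    best = max((U_sa[(s, a)] for (s, a) in exits if s in maximizer_states),
               default=0.0)  # the maximum over the empty set is 0
    for s in msec:
        U[s] = min(U[s], best)
    return U

U = {"s0": 1.0, "s1": 1.0}
U = deflate({"s0", "s1"}, {("s1", "b2")}, {("s1", "b2"): 0.68}, {"s1"}, U)
print(U["s0"], U["s1"])  # 0.68 0.68
```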


The remaining question is how to find MSECs. The approach of [KKKW18] is to find MSECs by removing the suboptimal actions of Minimizer according to the current lower bound. Since it converges to the true value function, all MSECs are eventually found [KKKW18, Lemma 2]. Since Algorithm 6 can only access the SG as a black box, there are two differences: we can only compare our estimates of the lower bound L̂(s, a) to find out which actions are suboptimal. Additionally, there is the problem that we might overlook an exit from an EC and hence deflate to some value that is too small; thus we need to check that any state set that FIND MSECs returns is a δ<sub>T</sub>-*sure* EC. This is illustrated in Example 3. For a bigger example of how all our functions work together, see Example 5 in [AKW19, Appendix B].


*Example 3.* For this example, we use the full SG from Fig. 1, including the dashed part, with p<sub>1</sub>, p<sub>2</sub> > 0. Let (s<sub>0</sub>, a<sub>1</sub>, s<sub>1</sub>, b<sub>2</sub>, s<sub>2</sub>, b<sub>1</sub>, s<sub>1</sub>, a<sub>2</sub>, s<sub>2</sub>, c, 1) be the path generated by our simulation. Then in our partial view of the model, it seems as if T = {s<sub>0</sub>, s<sub>1</sub>} is an MSEC, since using a<sub>2</sub> is suboptimal for the minimizing state s<sub>0</sub>,<sup>6</sup> and according to our current knowledge a<sub>1</sub>, b<sub>1</sub> and b<sub>2</sub> all stay inside T. If we deflated T now, all states would get an upper bound of 0, which would be incorrect.

Thus in Algorithm 6 we need to require that T is δ<sub>T</sub>-*surely* an EC. This was not satisfied in the example, as the state-action pairs have not been observed the required number of times. Thus we do not deflate T, and our upper bounds stay correct. Having seen (s<sub>1</sub>, b<sub>2</sub>) the required number of times, we would probably know that it is exiting T and hence would not make the mistake.

#### **3.5 Guidance and Statistical Guarantee**

It is difficult to give statistical guarantees for the algorithm we have developed so far (i.e. Algorithm 1 calling the new functions from Sects. 3.2, 3.3 and 3.4). Although we can bound the error of each function, applying them repeatedly can add up the error. Algorithm 7 shows our approach to get statistical guarantees: It interleaves a guided simulation phase (Lines 7–10) with a guaranteed standard bounded value iteration (called BVI phase) that uses our new functions (Lines 11–16).

The simulation phase builds the partial model by exploring states and remembering the counters. In the first iteration of the main loop, it chooses actions randomly. In all further iterations, it is guided by the bounds that the last BVI

<sup>6</sup> For δ<sub>T</sub> = 0.2, sampling the path to the target once suffices to realize that L̂(s<sub>0</sub>, a<sub>2</sub>) > 0.

phase computed. After N<sub>k</sub> simulations (see below for a discussion of how to choose N<sub>k</sub>), all the gathered information is used to compute one version of the partial model with probability estimates T̂ for a certain error tolerance δ<sub>k</sub>. We can continue under the assumption that these probability estimates are correct, since it is violated only with a probability smaller than our error tolerance (see below for an explanation of the choice of δ<sub>k</sub>). So in our correct partial model, we re-initialize the lower and upper bounds (Line 12) and execute a guaranteed standard BVI. If the simulation phase has already gathered enough data, i.e. explored the relevant states and sampled the relevant transitions often enough, this BVI achieves a precision smaller than ε in the initial state, and the algorithm terminates. Otherwise we start another simulation phase that is guided by the improved bounds.


**Choice of** δ<sub>k</sub>**:** For each of the full BVI phases, we construct a partial model that is correct with probability (1 − δ<sub>k</sub>). To ensure that the sum of these errors is not larger than the specified error tolerance δ, we use the variable k, which is initialised to 1 and doubled in every iteration of the main loop. Hence for the i-th BVI, k = 2<sup>i</sup>. By setting δ<sub>k</sub> := δ/k, we get that ∑<sub>i=1</sub><sup>∞</sup> δ<sub>k</sub> = ∑<sub>i=1</sub><sup>∞</sup> δ/2<sup>i</sup> = δ, and hence the error of all BVI phases does not exceed the specified error tolerance.
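The geometric split of the error budget can be checked numerically (a small sketch):

```python
# The per-phase tolerances delta_k = delta / 2^i form a geometric series that
# sums to (at most) delta, so the total error over all BVI phases is bounded.

delta, k, total = 0.1, 1, 0.0
for i in range(1, 31):      # thirty BVI phases
    k *= 2                  # k is doubled in each iteration of the main loop
    total += delta / k      # delta_k = delta / k
print(total < delta)              # True: partial sums never exceed delta
print(abs(delta - total) < 1e-6)  # True: and they converge to delta
```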

**When to Stop Each BVI-Phase:** The BVI phase might not converge if the probability estimates are not good enough. We increase the number of iterations for each BVI depending on k, because that way we ensure that it is eventually allowed to run long enough to converge. On the other hand, since we always run for finitely many iterations, we also ensure that BVI is eventually stopped if we do not have enough information yet. Other stopping criteria could return arbitrarily imprecise results [HM17]. We also multiply with |S| to improve the chances of the early BVIs to converge, as that number of iterations ensures that every value has been propagated through the whole model at least once.

**Discussion of the Choice of** N<sub>k</sub>**:** The number of simulations between the guaranteed BVI phases can be chosen freely; it can be a constant number every time, or any sequence of natural numbers, possibly parameterised by e.g. k, |S|, ε or any of the parameters of G. The design of particularly efficient choices or of learning mechanisms that adjust them on the fly is an interesting task left for future work. We conjecture the answer depends on the given SG and the "task" that the user has for the algorithm: e.g. if one just needs a quick general estimate of the behaviour of the model, a smaller choice of N<sub>k</sub> is sensible; if on the other hand a definite precision ε certainly needs to be achieved, a larger choice of N<sub>k</sub> is required.

**Theorem 1.** *For any choice of sequence for* N<sub>k</sub>*, Algorithm 7 is an anytime algorithm with the following property: when it is stopped, it returns an interval for* V(s0) *that is PAC*<sup>7</sup> *for the given error tolerance* δ *and some* ε′*, with* 0 ≤ ε′ ≤ 1*.*

Theorem 1 is the foundation of the practical usability of our algorithm. Given some time frame and some N<sub>k</sub>, it calculates an approximation for V(s0) that is probably correct. Note that the achieved precision ε′ is independent of the input parameter ε, and could in the worst case always be 1. In practice, however, it is often good (i.e. close to 0), as seen in the results in Sect. 4. Moreover, in our modified algorithm, we can also give a convergence guarantee as in [BCC+14]. Although mostly of theoretical interest, in [AKW19, Appendix D.4] we design such a sequence N<sub>k</sub>, too. Since this a priori sequence has to work in the worst case, it depends on an infeasibly large number of simulations.

**Theorem 2.** *There exists a choice of* N<sub>k</sub>*, such that Algorithm 7 is PAC for any input parameters* ε, δ*, i.e. it terminates almost surely and returns an interval for* V(s0) *of width smaller than* ε *that is correct with probability at least* 1 − δ*.*

<sup>7</sup> Probably Approximately Correct, i.e. with probability greater than 1 − δ, the value lies in the returned interval of width ε′.

#### **3.6 Utilizing the Additional Information of Grey Box Input**

In this section, we consider the grey box setting, i.e. for every state-action pair (s, a) we additionally know the exact number of successors |Post(s, a)|. Then we can sample every state-action pair until we have seen all successors, and hence this information amounts to having qualitative information about the transitions, i.e. knowing where the transitions go, but not with which probability.

In that setting, we can improve the EC-detection and the estimated bounds in UPDATE. For EC-detection, note that the whole point of δT*-sure* EC is to check whether there are further transitions available; in the grey box setting, we know this and need not depend on statistics. For the bounds, note that the equations for L and U both have two parts: the usual Bellman part and the remaining probability multiplied with the most conservative guess of the bound, i.e. 0 and 1. If we know all successors of a state-action pair, we do not have to be as conservative; instead, we can use min<sub>t∈Post(s,a)</sub> L(t) and max<sub>t∈Post(s,a)</sub> U(t), respectively. Both these improvements have a huge impact, as demonstrated in Sect. 4. However, of course, they also assume more knowledge about the model.
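The difference between the two settings can be illustrated on a single state-action pair (a hypothetical rendering of the UPDATE step; the names and the dictionary representation are ours, not the paper's):

```python
def update_pair(est, remaining, succs, L, U, grey_box):
    """Sketch of one UPDATE step for a state-action pair (s, a).

    est       : estimated probability of each successor seen so far
    remaining : probability mass not yet attributed to seen successors
    succs     : the full set Post(s, a), available only in the grey box
    L, U      : current lower/upper bound functions, as dicts
    """
    lo = sum(p * L[t] for t, p in est.items())  # usual Bellman part
    up = sum(p * U[t] for t, p in est.items())
    if grey_box:
        # All successors are known: bound the unattributed mass by the
        # worst and best known successor instead of 0 and 1.
        lo += remaining * min(L[t] for t in succs)
        up += remaining * max(U[t] for t in succs)
    else:
        # Black box: most conservative guesses for the unseen mass.
        lo += remaining * 0.0
        up += remaining * 1.0
    return lo, up
```

With half of the probability mass still unattributed, the grey box interval is strictly tighter than the black box one, e.g. [0.4, 0.6] instead of [0.2, 0.75] in the test below.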

#### **4 Experimental Evaluation**

We implemented the approach as an extension of PRISM-Games [CFK+13a]. 11 MDPs with reachability properties were selected from the Quantitative Verification Benchmark Set [HKP+19]. Further, 4 stochastic game benchmarks from [CKJ12,SS12,CFK+13b,CKPS11] were also selected. We ran the experiments on a 40-core Intel Xeon server running at 2.20 GHz per core and having 252 GB of RAM. The tool, however, utilised only a single core and 1 GB of memory for the model checking. Each benchmark was run 10 times with a timeout of 30 min. We ran two versions of Algorithm 7: one with the SG as a black box, the other as a grey box (see Definition 2). We chose N<sub>k</sub> = 10,000 for all iterations. The tool stopped either when a precision of 10<sup>−8</sup> was obtained or after 30 min. In total, 16 different model-property combinations were tried out. The results of the experiment are reported in Table 1.

In the black box setting, we obtained ε′ < 0.1 on 6 of the benchmarks. 5 benchmarks were 'hard', and the algorithm did not improve the precision below 1. For 4 of them, it did not even finish the first simulation phase. If we decrease N<sub>k</sub>, the BVI phase is entered, but still no progress is made.

In the grey box setting, on 14 of the 16 benchmarks it took only 6 min to achieve ε′ < 0.1. For 8 of these, the exact value was found within that time. Less than 50% of the state space was explored in the case of pacman, pnueli-zuck-3, rabin-3, zeroconf and cloud 5. A precision of ε′ < 0.01 was achieved on 15/16 benchmarks over a period of 30 min.

**Table 1.** Achieved precision ε′ given by our algorithm in both grey and black box settings after running for a period of 30 min (see the paragraph below Theorem 1 for why we use ε′ and not ε). The first set of models are MDPs and the second set are SGs. '-' indicates that the algorithm did not finish the first simulation phase, and hence partial BVI was not called. m is the number of steps required by the DQL algorithm of [BCC+14] before the first update. As this number is very large, we report only log<sub>10</sub>(m). For comparison, note that the age of the universe is approximately 10<sup>26</sup> ns; the logarithm of the number of steps doable in this time is thus on the order of 26.


Figure 2 shows the evolution of the lower and upper bounds in both the grey and the black box settings for 4 different models. Graphs for the other models, as well as more details on the results, are in [AKW19, Appendix C].

**Fig. 2.** Performance of our algorithm on various MDP and SG benchmarks in grey and black box settings. Solid lines denote the bounds in the grey box setting while dashed lines denote the bounds in the black box setting. The plotted bounds are obtained after each partial BVI phase, which is why they start neither at [0, 1] nor at time 0. Graphs of the remaining benchmarks may be found in [AKW19, Appendix C].

#### **5 Conclusion**

We presented a PAC SMC algorithm for SGs (and MDPs) with the reachability objective. It is the first such algorithm for SGs and the first practically applicable one. Nevertheless, there are several possible directions for further improvements. For instance, one can consider different sequences for the lengths of the simulation phases, possibly also dependent on the behaviour observed so far. Further, the error tolerance could be distributed in a non-uniform way, allowing for fewer visits in rarely visited parts of end components. Since many systems are strongly connected, but at the same time feature some infrequent behaviour, this is the next bottleneck to be attacked [KM19].

#### **References**

- [AKW19] Ashok, P., Křetínský, J., Weininger, M.: PAC statistical model checking for Markov decision processes and stochastic games. Technical report, arXiv:1905.04403 (2019)
- [BHH12] Bogdoll, J., Hartmanns, A., Hermanns, H.: Simulation and statistical model checking for modestly nondeterministic models. In: Schmitt, J.B. (ed.) MMB&DFT 2012. LNCS, vol. 7201, pp. 249–252. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28540-0_20
- [BK08] Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press (2008). ISBN 978-0-262-02649-9
- [BT99] Brafman, R.I., Tennenholtz, M.: A near-optimal poly-time algorithm for learning a class of stochastic games. In: IJCAI, pp. 734–739 (1999)
- [CH08] Chatterjee, K., Henzinger, T.A.: Value iteration. In: Grumberg, O., Veith, H. (eds.) 25 Years of Model Checking. LNCS, vol. 5000, pp. 107–138. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-69850-0_7
- [CH12] Chatterjee, K., Henzinger, T.A.: A survey of stochastic ω-regular games. J. Comput. Syst. Sci. **78**(2), 394–413 (2012)
- [CKJ12] Calinescu, R., Kikuchi, S., Johnson, K.: Compositional reverification of probabilistic safety properties for large-scale complex IT systems. In: Calinescu, R., Garlan, D. (eds.) Monterey Workshop 2012. LNCS, vol. 7539, pp. 303–329. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34059-8_16
- [CKPS11] Chen, T., Kwiatkowska, M., Parker, D., Simaitis, A.: Verifying team formation protocols with probabilistic model checking. In: Leite, J., Torroni, P., Ågotnes, T., Boella, G., van der Torre, L. (eds.) CLIMA 2011. LNCS (LNAI), vol. 6814, pp. 190–207. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22359-4_14
- [DHS18] D'Argenio, P.R., Hartmanns, A., Sedwards, S.: Lightweight statistical model checking in nondeterministic continuous time. In: Margaria, T., Steffen, B. (eds.) ISoLA 2018. LNCS, vol. 11245, pp. 336–353. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03421-4_22
- [DLST15] D'Argenio, P., Legay, A., Sedwards, S., Traonouez, L.-M.: Smart sampling for lightweight verification of Markov decision processes. STTT **17**(4), 469–484 (2015)
- [EGF12] Ellen, C., Gerwinn, S., Fränzle, M.: Confidence bounds for statistical model checking of probabilistic hybrid systems. In: Jurdziński, M., Nicković, D. (eds.) FORMATS 2012. LNCS, vol. 7595, pp. 123–138. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33365-1_10
- [FT14] Fu, J., Topcu, U.: Probably approximately correct MDP learning and control with temporal logic constraints. In: Robotics: Science and Systems (2014)
- [HAK18] Hasanbeig, M., Abate, A., Kroening, D.: Logically-correct reinforcement learning. CoRR, abs/1801.08099 (2018)
- [HAK19] Hasanbeig, M., Abate, A., Kroening, D.: Certified reinforcement learning with logic guidance. CoRR, abs/1902.00778 (2019)
- [HJB+10] He, R., Jennings, P., Basu, S., Ghosh, A.P., Wu, H.: A bounded statistical approach for model checking of unbounded until properties. In: ASE, pp. 225–234 (2010)
- [HM17] Haddad, S., Monmege, B.: Interval iteration algorithm for MDPs and IMDPs. Theor. Comput. Sci. (2017)
- [JLS12] Jegourel, C., Legay, A., Sedwards, S.: A platform for high performance statistical model checking - PLASMA. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 498–503. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28756-5_37
- [KM19] Křetínský, J., Meggendorfer, T.: Of cores: a partial-exploration framework for Markov decision processes. Submitted (2019)
- [KNP11] Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_47
- [Lar12] Larsen, K.G.: Statistical model checking, refinement checking, optimization, ... for stochastic hybrid systems. In: Jurdziński, M., Nicković, D. (eds.) FORMATS 2012. LNCS, vol. 7595, pp. 7–10. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33365-1_2
- [Lar13] Larsen, K.G.: Priced timed automata and statistical model checking. In: Johnsen, E.B., Petre, L. (eds.) IFM 2013. LNCS, vol. 7940, pp. 154–161. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38613-8_11
- [Lit94] Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: ICML, pp. 157–163 (1994)
- [LN81] Lakshmivarahan, S., Narendra, K.S.: Learning algorithms for two-person zero-sum stochastic games with incomplete information. Math. Oper. Res. **6**(3), 379–386 (1981)
- [Put14] Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, Hoboken (2014)
- [RF91] Raghavan, T.E.S., Filar, J.A.: Algorithms for stochastic games - a survey. Z. Oper. Res. **35**(6), 437–472 (1991)
- [RP09] El Rabih, D., Pekergin, N.: Statistical model checking using perfect simulation. In: Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 120–134. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04761-9_11
- [SB98] Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
- [SS12] Saffre, F., Simaitis, A.: Host selection through collective decision. ACM Trans. Auton. Adapt. Syst. **7**(1), 4:1–4:16 (2012)
- [SVA04] Sen, K., Viswanathan, M., Agha, G.: Statistical model checking of black-box probabilistic systems. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 202–215. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27813-9_16
- [SVA05] Sen, K., Viswanathan, M., Agha, G.: On statistical model checking of stochastic systems. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 266–280. Springer, Heidelberg (2005). https://doi.org/10.1007/11513988_26
- [WT16] Wen, M., Topcu, U.: Probably approximately correct learning in stochastic games with temporal logic specifications. In: IJCAI, pp. 3630–3636 (2016)
- [YCZ10] Younes, H.L.S., Clarke, E.M., Zuliani, P.: Statistical verification of probabilistic properties with unbounded until. In: Davies, J., Silva, L., Simao, A. (eds.) SBMF 2010. LNCS, vol. 6527, pp. 144–160. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19829-8_10
- [YS02a] Younes, H.L.S., Simmons, R.G.: Probabilistic verification of discrete event systems using acceptance sampling. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 223–235. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45657-0_17
- [ZPC10] Zuliani, P., Platzer, A., Clarke, E.M.: Bayesian statistical model checking with application to simulink/stateflow verification. In: HSCC, pp. 243–252 (2010)

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Symbolic Monitoring Against Specifications Parametric in Time and Data**

Masaki Waga<sup>1,2,3(B)</sup>, Étienne André<sup>1,4,5</sup>, and Ichiro Hasuo<sup>1,2</sup>

<sup>1</sup> National Institute of Informatics, Tokyo, Japan mwaga@nii.ac.jp

<sup>2</sup> SOKENDAI (The Graduate University for Advanced Studies), Tokyo, Japan <sup>3</sup> JSPS Research Fellow, Tokyo, Japan

<sup>4</sup> Université Paris 13, LIPN, CNRS, UMR 7030, 93430 Villetaneuse, France <sup>5</sup> JFLI, CNRS, Tokyo, Japan

**Abstract.** Monitoring consists in deciding whether a log meets a given specification. In this work, we propose an automata-based formalism to monitor logs in the form of actions associated with time stamps and arbitrary data values over infinite domains. Our formalism uses both timing parameters and data parameters, and is able to output answers symbolic in these parameters and in the log segments where the property is satisfied or violated. We implemented our approach in an ad-hoc prototype SyMon, and experiments show that its high expressive power still allows for efficient online monitoring.

#### **1 Introduction**

Monitoring consists in checking whether a sequence of data (a log or a signal) satisfies or violates a specification expressed using some formalism. Offline monitoring consists in performing this analysis after the system execution, as the technique has access to the entire log in order to decide whether the specification is violated. In contrast, online monitoring can make a decision earlier, ideally as soon as a witness of the violation of the specification is encountered.

Using existing formalisms (e.g., the metric first order temporal logic [14]), one can check whether a given bank customer withdraws more than 1,000 € every week. With formalisms extended with data, one may even *identify* such customers. Or, using an extension of the signal temporal logic (STL) [18], one can ask: "is it true that the value of variable x is always copied to y exactly 4 time units later?" However, questions relating time and data using parameters become

This work is partially supported by JST ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), by JSPS Grants-in-Aid No. 15KT0012 & 18J22498 and by the ANR national research program PACS (ANR-14-CE28-0002).

much harder (or even impossible) to express using existing formalisms: "what are the users and time frames during which a user withdraws more than half of the total bank withdrawals within seven days?" And even, can we *synthesize* the durations (not necessarily 7 days) for which this specification holds? Or "what is the set of variables for which there exists a duration within which their value is always copied to another variable?" In addition, detecting periodic behaviors without knowing the period can be hard to achieve using existing formalisms.

In this work, we address the challenging problem of monitoring logs enriched with both timing information and (infinite domain) data. In addition, we significantly push the existing limits of expressiveness so as to allow for a further level of abstraction using *parameters*: our specification can be parametric both in the *time* and in the *data*. The answer to this symbolic monitoring is richer than a pure Boolean answer, as it *synthesizes* the values of both time and data parameters for which the specification holds. This notably allows us to detect periodic behaviors without knowing the period while being symbolic in terms of data. For example, we can *synthesize variable names* (data) and *delays* for which variables will have their value copied to another variable within the aforementioned delay. In addition, we show that we can detect the log *segments* (start and end date) for which a specification holds.

*Example 1.* Consider a system updating three variables a, b and c (i. e., strings) to values (rationals). An example log is given in Fig. 1a. Although our work is event-based, we can give a graphical representation similar to that of signals in Fig. 1b. Consider the following property: "for any variable px, whenever an update of that variable occurs, then within strictly less than tp time units, the value of variable b must be equal to that update". The *variable parameter* px is compared with string values and the *timing parameter* tp is used in the timing constraints. We are interested in checking for which values of px and tp this property is violated. This can be seen as a synthesis problem in both the variable and timing parameters. For example, px = c and tp = 1.5 is a violation of the specification, as the update of c to 2 at time 4 is not propagated to b within 1.5 time units. Our algorithm outputs such violations as a constraint, e.g., px = c ∧ tp ≤ 2. In contrast, at any time, either b is equal to the value of a given signal, or it will become equal to that value within at most 2 time units. Thus, the specification holds for any valuation of the variable parameter px, provided tp > 2.
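A naive offline check of the property of Example 1 can be sketched as follows (the concrete log below is invented for illustration, except for the update of c to 2 at time 4 mentioned above; Fig. 1a is not reproduced here):

```python
def b_values_in(log, start, end):
    """Values variable b holds during [start, end) in an event log of
    (timestamp, variable, value) triples sorted by timestamp."""
    current, vals = None, []
    for t, var, val in log:
        if var != 'b':
            continue
        if t <= start:
            current = val     # b's value when the window opens
        elif t < end:
            vals.append(val)  # updates strictly inside the window
    return ([current] if current is not None else []) + vals

def violates(log, px, tp):
    """True iff some update of px to a value v is not matched by b = v
    strictly within tp time units."""
    return any(val not in b_values_in(log, t, t + tp)
               for t, var, val in log if var == px)
```

On a log where c is set to 2 at time 4 and b only catches up at time 6, `violates(log, 'c', tp)` holds exactly for tp ≤ 2, matching the constraint px = c ∧ tp ≤ 2.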

We propose an automata-based approach to perform monitoring parametric in both time and data. We implement our work in a prototype SyMon and perform experiments showing that, while our formalism allows for high expressiveness, it is also tractable even for online monitoring.

We believe our framework balances expressiveness and monitoring performance well: *(i)* Regarding expressiveness, comparison with the existing work is summarized in Table 1 (see Sect. 2 for further details). *(ii)* Our monitoring is *complete*, in the sense that it returns a symbolic constraint characterizing *all* the parameter valuations that match a given specification. *(iii)* We also achieve

**Table 1.** Comparison of monitoring expressiveness

**Fig. 1.** Monitoring copy to b within tp time units

reasonable monitoring speed, especially given the degree of parametrization in our formalism. Note that it is not easy to formally claim superiority in expressiveness: proofs would require arguments such as the pumping lemma; and such formal comparison does not seem to be a concern of the existing work. Moreover, such formal comparison bears little importance for industrial practitioners: expressivity via an elaborate encoding is hardly of practical use. We also note that, in the existing work, we often observe gaps between the formalism in a theory and the formalism that the resulting tool actually accepts. This is not the case with the current framework.

*Outline.* After discussing related works in Sect. 2, we introduce the necessary preliminaries in Sect. 3, and our parametric timed data automata in Sect. 4. We present our symbolic monitoring approach in Sect. 5 and conduct experiments in Sect. 6. We conclude in Sect. 7.

#### **2 Related Works**

*Robustness and Monitoring.* Robust (or quantitative) monitoring extends the binary question whether a log satisfies a specification by asking "by how much" the specification is satisfied. The quantification of the distance between a signal and a signal temporal logic (STL) specification has been addressed in, e.g., [20–23,25,27] (or in a slightly different setting in [5]). The distance can be understood in terms of space ("signals") or time. In [6], the distance also accounts for reordering of events. In [10], the *robust pattern matching problem* is considered over signal regular expressions, by quantifying the distance between the signal regular expression specification and the *segments* of the signal. For piecewise-constant and piecewise-linear signals, the problem can be effectively solved using a finite union of convex polyhedra. While our framework does not fit in robust monitoring, we can simulate both robustness w.r.t. time (using timing parameters) and w.r.t. data, e.g., signal values (using data parameters).

*Monitoring with Data.* The tool MarQ [30] performs monitoring using Quantified Event Automata (QEA) [12]. This approach and ours share the automata-based framework, the ability to express some first-order properties using "events containing data" (which we encode using local variables associated with actions), and the fact that data may be quantified. However, [30] does not seem to natively support specifications parametric in time; in addition, [30] does not perform complete ("symbolic") parameter synthesis, but outputs the violating entries of the log.

The metric first order temporal logic (MFOTL) allows for a high expressiveness by allowing universal and existential quantification over data, which can be seen as a way to express parameters. A monitoring algorithm is presented for a safety fragment of MFOTL in [14]. Aggregation operators are added in [13], making it possible to compute *sums* or *maximums* over data. A fragment of this logic is implemented in MonPoly [15]. While these works are highly expressive, they do not natively consider timing parameters; in addition, MonPoly does not output symbolic answers, i. e., symbolic conditions on the parameters to ensure validity of the formula.

In [26], binary decision diagrams (BDDs) are used to symbolically represent the observed data in QTL. This can be seen as monitoring data against a parametric specification, with a symbolic internal encoding. However, their implementation DejaVu only outputs *concrete* answers. In contrast, we are able to provide symbolic answers (both in timing and data parameters), e.g., in the form of unions of polyhedra for rationals, and unions of string constraints using equalities (=) and inequalities (≠).

*Freeze Operator.* In [18], STL is extended with a freeze operator that can "remember" the value of a signal, to compare it to a later value of the same signal. This logic STL<sup>∗</sup> can express properties such as "In the initial 10 s, x copies the values of y within a delay of 4 s": **G**<sub>[0,10]</sub>∗(**F**<sub>[0,4]</sub> y<sup>∗</sup> = x). While the setting is somehow different (STL<sup>∗</sup> operates over signals while we operate over timed data words), requirements such as the one above can easily be encoded in our framework. In addition, we are able to *synthesize* the delay within which the values are always copied, as in Example 1. In contrast, it is not possible to determine using STL<sup>∗</sup> which variables and which delays violate the specification.

*Monitoring with Parameters.* In [7], a log in the form of a dense-time real-valued signal is tested against a parameterized extension of STL, where parameters can be used to model uncertainty both in signal values and in timing values. The output comes in the form of a subset of the parameters space for which the formula holds on the log. In [9], the focus is only on signal parameters, with an improved efficiency by reusing techniques from the *robust* monitoring. Whereas [7,9] fit in the framework of signals and temporal logics while we fit in words and automata, our work shares similarities with [7,9] in the sense that we can express data parameters; in addition, [9] is able as in our work to exhibit the segment of the log associated with the parameters valuations for which the specification holds. A main difference however is that we can use memory and aggregation, thanks to arithmetic on variables.

In [24], the problem of *inferring* temporal logic formulae with constraints that hold in a given numerical data time series is addressed.

*Timed Pattern Matching.* A recent line of work is that of timed pattern matching, that takes as input a log and a specification, and decides *where* in the log the specification is satisfied or violated. On the one hand, a line of works considers signals, with specifications either in the form of timed regular expressions [11,31–33], or a temporal logic [34]. On the other hand, a line of works considers timed words, with specifications in the form of timed automata [4,36]. We will see that our work can also encode parametric timed pattern matching. Therefore, our work can be seen as a two-dimensional extension of both lines of works: first, we add timing parameters ([4] also considers similar timing parameters) and, second, we add data—themselves extended with parameters. That is, coming back to Example 1, [31–33,36] could only infer the segments of the log for which the property is violated for a given (fixed) variable and a given (fixed) timing parameter; while [4] could infer both the segments of the log and the timing parameter valuations, but not which variable violates the specification.

*Summary.* We compare related works in Table 1. "Timing parameters" denotes the ability to synthesize unknown constants used in timing constraints (e.g., modality intervals, or clock constraints). "**?**" denotes works not natively supporting this, although it might be encoded. The term "Data" refers to the ability to manage logs over infinite domains (apart from timestamps). For example, the log in Fig. 1a features, beyond timestamps, both strings (variable names) and rationals (values). Also, works based on real-valued signals are naturally able to manage (at least one type of) data. "Parametric data" refers to the ability to express formulas where data (including signal values) are compared to (quantified or unquantified) variables or unknown parameters; for example, for the log in Fig. 1a, a property parametric in data is to synthesize the parameters for which the difference of values between two consecutive updates of variable px is always below pv, where px is a string parameter and pv a rational-valued parameter. "Memory" is the ability to remember *past* data; this can be achieved using, e.g., the freeze operator of STL<sup>∗</sup>, or variables (e.g., in [14,26,30]). "Aggregation" is the ability to aggregate data using operators such as sum or maximum; this allows expressing properties such as "A user must not withdraw more than \$10,000 within a 31 day period" [13]. This can be supported using dedicated aggregation operators [13] or using variables ([30], and our work). "Complete parameter identification" denotes the *synthesis* of the set of parameters that satisfy or violate the property. Here, "N/A" denotes the absence of parameters [18], or parameters used in a way (existentially or universally quantified) such that the identification is not explicit (instead, the position of the log where the property is violated is returned [26]).
In contrast, we return in a *symbolic* manner (as in [4,7]) the exact set of (data and timing) parameters for which a property is satisfied. "√**/**×" denotes "yes" in the theory paper, but not in the tool.

#### **3 Preliminaries**

**Clocks, Timing Parameters and Timed Guards.** We assume a set C = {c<sub>1</sub>,...,c<sub>H</sub>} of *clocks*, i. e., real-valued variables that evolve at the same rate. A *clock valuation* is ν : C → R<sub>≥0</sub>. We write **0** for the clock valuation assigning 0 to all clocks. Given d ∈ R<sub>≥0</sub>, ν + d is s.t. (ν + d)(c) = ν(c) + d, for all c ∈ C. Given R ⊆ C, we define the *reset* of a valuation ν, denoted by [ν]<sub>R</sub>, as follows: [ν]<sub>R</sub>(c) = 0 if c ∈ R, and [ν]<sub>R</sub>(c) = ν(c) otherwise.

We assume a set TP = {tp<sub>1</sub>,...,tp<sub>J</sub>} of *timing parameters*. A *timing parameter valuation* is γ : TP → Q<sub>+</sub>. We assume ⊲⊳ ∈ {<, ≤, =, ≥, >}. A *timed guard* tg is a constraint over C ∪ TP defined by a conjunction of inequalities of the form c ⊲⊳ d or c ⊲⊳ tp, with d ∈ N and tp ∈ TP. Given tg, we write ν |= γ(tg) if the expression obtained by replacing each c with ν(c) and each tp with γ(tp) in tg evaluates to true.
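These operations translate directly into code; a small sketch with clock valuations as dictionaries and γ as a mapping from timing-parameter names to rationals (the encoding of guards as triples is our own, not the paper's):

```python
import operator
from fractions import Fraction

OPS = {'<': operator.lt, '<=': operator.le, '=': operator.eq,
       '>=': operator.ge, '>': operator.gt}

def elapse(nu, d):
    """Time elapse: (nu + d)(c) = nu(c) + d for every clock c."""
    return {c: t + d for c, t in nu.items()}

def reset(nu, R):
    """[nu]_R: clocks in R are set to 0, the others are unchanged."""
    return {c: 0 if c in R else t for c, t in nu.items()}

def sat_timed_guard(tg, nu, gamma):
    """nu |= gamma(tg), where tg is a conjunction of atoms (c, op, rhs)
    and rhs is a constant or a timing-parameter name resolved via gamma."""
    def rhs(r):
        return gamma[r] if isinstance(r, str) else r
    return all(OPS[op](nu[c], rhs(r)) for c, op, r in tg)
```

Using `Fraction` keeps timing-parameter valuations in Q<sub>+</sub> exact rather than approximated by floats.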

**Variables, Data Parameters and Data Guards.** For the sake of simplicity, we assume a *single* infinite domain D for data. The formalism defined in Sect. 4 can be extended in a straightforward manner to different domains for different variables (and our implementation does allow for different types). The case of a *finite* data domain is immediate too. We define this formalism in an *abstract* manner, so as to allow a sort of parameterized domain.

We assume a set V = {v<sub>1</sub>,...,v<sub>M</sub>} of *variables* valued over D. These are internal variables, which give our framework a high expressive power, as they can be compared or updated to other variables or parameters. We also assume a set LV = {*lv*<sub>1</sub>,..., *lv*<sub>O</sub>} of *local variables* valued over D. These variables will only be used locally along a transition, in the "argument" of the action (e.g., x and v in update(x, v)), and in the associated guard and (right-hand part of) updates. We assume a set VP = {vp<sub>1</sub>,..., vp<sub>N</sub>} of *data parameters*, i. e., unknown variable constants.

A *data type* (D, DE, DU) is made of *(i)* an infinite domain D, *(ii)* a set of admissible Boolean expressions DE (that may rely on V, LV and VP), which will define the type of guards over variables in our subsequent automata, and *(iii)* a domain for updates DU (that may rely on V, LV and VP), which will define the type of updates of variables in our subsequent automata.

*Example 2.* As a first example, let us define the data type for rationals. We have D = Q. Let us first define Boolean expressions. A *rational comparison* is a constraint over V ∪ LV ∪ VP defined by a conjunction of inequalities of the form v ⊲⊳ d, v ⊲⊳ v′, or v ⊲⊳ vp, with v, v′ ∈ V ∪ LV, d ∈ Q and vp ∈ VP. DE is the set of all rational comparisons over V ∪ LV ∪ VP. Let us then define updates. First, a linear arithmetic expression over V ∪ LV ∪ VP is Σ<sub>i</sub> α<sub>i</sub>v<sub>i</sub> + β, where v<sub>i</sub> ∈ V ∪ LV ∪ VP and α<sub>i</sub>, β ∈ Q. Let LA(V ∪ LV ∪ VP) denote the set of linear arithmetic expressions over V, LV and VP. We then have DU = LA(V ∪ LV ∪ VP).

As a second example, let us define the data type for strings. We have D = S, where S denotes the set of all strings. A *string comparison* is a constraint over V ∪ LV ∪ VP defined by a conjunction of comparisons of the form v ≈ s, v ≈ v′, or v ≈ vp, with v, v′ ∈ V ∪ LV, s ∈ S, vp ∈ VP and ≈ ∈ {=, ≠}. DE is the set of all string comparisons over V ∪ LV ∪ VP. DU = V ∪ LV ∪ S, i.e., a string variable can be assigned another string variable or local variable, or a concrete string.

A *variable valuation* is a function μ : V → D. A *local variable valuation* is a partial function η : LV ⇀ D. A *data parameter valuation* is ζ : VP → D. Given a data guard dg ∈ DE, a variable valuation μ, a local variable valuation η defined for the local variables in dg, and a data parameter valuation ζ, we write (μ, η) |= ζ(dg) if the expression obtained by replacing, within dg, all occurrences of each data parameter vpᵢ by ζ(vpᵢ) and all occurrences of each variable vⱼ (resp. local variable lvₖ) by its concrete valuation μ(vⱼ) (resp. η(lvₖ)) evaluates to true.
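The satisfaction relation (μ, η) |= ζ(dg) can be paraphrased as "substitute, then evaluate". A hedged sketch (the representation of dg as a Python expression string is purely illustrative):

```python
# Check (mu, eta) |= zeta(dg) by merging the three valuations into one
# environment and evaluating the now-closed Boolean expression, mirroring
# the substitution order of the definition.

def satisfies(dg, mu, eta, zeta):
    env = {}
    env.update(zeta)   # data parameters vp_i -> zeta(vp_i)
    env.update(mu)     # variables v_j -> mu(v_j)
    env.update(eta)    # local variables lv_k -> eta(lv_k) (partial)
    # dg is a Python Boolean expression over the names above (illustration only)
    return eval(dg, {"__builtins__": {}}, env)

mu = {"v1": 3}
eta = {"lv1": 5}
zeta = {"vp1": 4}
print(satisfies("v1 < vp1 and lv1 == 5", mu, eta, zeta))  # True
```

Note that η need only be defined for the local variables actually occurring in dg, exactly as required above.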

A *parametric data update* is a partial function PDU : V ⇀ DU. That is, we can assign to a variable an expression over data parameters and other variables, according to the data type. Given a parametric data update PDU, a variable valuation μ, a local variable valuation η (defined for all local variables appearing in PDU), and a data parameter valuation ζ, we define [μ]η(ζ(PDU)) : V → D as:

$$[\mu]_{\eta(\zeta(\mathsf{PDU}))}(v) = \begin{cases} \mu(v) & \text{if } \mathsf{PDU}(v) \text{ is undefined} \\ \eta(\mu(\zeta(\mathsf{PDU}(v)))) & \text{otherwise} \end{cases}$$

where η(μ(ζ(PDU(v)))) denotes the replacement, within the update expression PDU(v), of all occurrences of each data parameter vpᵢ by ζ(vpᵢ), and all occurrences of each variable vⱼ (resp. local variable lvₖ) by its concrete valuation μ(vⱼ) (resp. η(lvₖ)). Observe that this replacement gives a value in D, therefore the result of [μ]η(ζ(PDU)) is indeed a variable valuation V → D. That is, [μ]η(ζ(PDU)) computes the new (non-parametric) variable valuation obtained after applying to μ the partial function PDU valuated with ζ.

**Table 2.** Variables, parameters and valuations used in guards

*Example 3.* Consider the data type for rationals, the variable set {v₁, v₂}, the local variable set {lv₁, lv₂} and the parameter set {vp₁}. Let μ be the variable valuation such that μ(v₁) = 1 and μ(v₂) = 2, and η be the local variable valuation such that η(lv₁) = 2 and η(lv₂) is undefined. Let ζ be the data parameter valuation such that ζ(vp₁) = 1. Consider the parametric data update function PDU such that PDU(v₁) = 2 × v₁ + v₂ − lv₁ + vp₁, and PDU(v₂) is undefined. Then the result of [μ]η(ζ(PDU)) is μ′ such that μ′(v₁) = 2 × μ(v₁) + μ(v₂) − η(lv₁) + ζ(vp₁) = 3 and μ′(v₂) = 2.
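Example 3 can be replayed with a few lines of code. This is a hypothetical encoding of [μ]η(ζ(PDU)) in which each update expression is a callable over the merged valuation:

```python
def apply_update(pdu, mu, eta, zeta):
    """Compute [mu]_{eta(zeta(PDU))}: apply the partial update function pdu
    to mu; variables not in the domain of pdu keep their old value."""
    env = {**zeta, **mu, **eta}   # merged concrete valuation
    return {v: (pdu[v](env) if v in pdu else mu[v]) for v in mu}

mu = {"v1": 1, "v2": 2}
eta = {"lv1": 2}                  # lv2 left undefined, as in Example 3
zeta = {"vp1": 1}
pdu = {"v1": lambda e: 2 * e["v1"] + e["v2"] - e["lv1"] + e["vp1"]}

print(apply_update(pdu, mu, eta, zeta))  # {'v1': 3, 'v2': 2}
```

The output matches the example: v₁ is updated to 2 × 1 + 2 − 2 + 1 = 3, while v₂, outside the domain of PDU, is unchanged.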

#### **4 Parametric Timed Data Automata**

We introduce here parametric timed data automata (PTDAs). They can be seen as an extension of parametric timed automata [2] (which extend timed automata [1] with parameters in place of integer constants) with unbounded data variables and data parameters. PTDAs can also be seen as an extension of several extensions of timed automata with data (see e.g., [16,19,29]), which we again extend with both data parameters and timing parameters; or as an extension of quantified event automata [12] with explicit time representation using clocks, further augmented with timing parameters. PTDAs feature both timed guards and data guards; we summarize the various variable and parameter types together with their notations in Table 2.

We associate local variables with actions (which can thus be seen as *predicates*). Let *Dom* : Σ → 2^LV denote the function assigning to each action its set of local variables. Let *Var*(dg) (resp. *Var*(PDU)) denote the set of variables occurring in dg (resp. PDU).

**Definition 1 (PTDA).** *Given a data type* (D, DE, DU)*, a parametric timed data automaton (PTDA)* A *over this data type is a tuple* A = (Σ, L, ℓ₀, F, C, TP, V, LV, μ₀, VP, E)*, where:*


*9.* E *is a finite set of edges* e = (ℓ, tg, dg, a, R, PDU, ℓ′) *where (i)* ℓ, ℓ′ ∈ L *are the source and target locations, (ii)* tg *is a timed guard, (iii)* dg ∈ DE *is a data guard such that Var*(dg) ∩ LV ⊆ *Dom*(a)*, (iv)* a ∈ Σ*, (v)* R ⊆ C *is the set of clocks to be reset, and (vi)* PDU : V ⇀ DU *is the parametric data update function such that Var*(PDU) ∩ LV ⊆ *Dom*(a)*.*

The domain conditions on dg and PDU ensure that the local variables used in the guard (resp. update) are only those in the action signature *Dom*(a).

**Fig. 2.** Monitoring proper file opening and closing

*Example 4.* Consider the PTDA in Fig. 2b over the data type for strings. We have C = {c}, TP = {tp}, V = ∅ and LV = {f, m}. *Dom*(open) = {f, m} while *Dom*(close) = {f}. ℓ₂ is the only accepting location, modeling the violation of the specification.

This PTDA (freely inspired by a formula from [26], further extended with timing parameters) monitors improper file opening and closing, i.e., a file that is already open should not be opened again, and a file that is open should not be closed too late. The data parameter vp is used to *symbolically* monitor a given file name, i.e., we are interested in openings and closings of this file only, while other files are disregarded (specified using the self-loops on ℓ₀ and ℓ₁ with data guard f ≠ vp). Whenever f is opened (transition from ℓ₀ to ℓ₁), clock c is reset. Then, in ℓ₁, if f is closed within tp time units (timed guard "c ≤ tp"), the system goes back to ℓ₀. However, if instead f is opened again, this is an incorrect behavior and the system enters ℓ₂ via the upper transition. The same occurs if f is closed more than tp time units after opening.
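For concreteness, a hand-coded (and deliberately simplified) instance of this monitor for fixed values of tp and vp could look as follows; the function and log below are illustrative, not part of the formal semantics:

```python
# Simplified instance of the file-monitoring automaton of Fig. 2b for fixed
# parameter values: returns True iff the (violation) location l2 is reached.

def monitor(events, tp, vp):
    """events: list of (action, timestamp, eta) triples."""
    loc, open_time = 0, None
    for action, tau, eta in events:
        if eta.get("f") != vp:
            continue                      # self-loops: other files disregarded
        if loc == 0 and action == "open":
            loc, open_time = 1, tau       # clock c reset on opening
        elif loc == 1 and action == "open":
            return True                   # re-opening an open file: violation
        elif loc == 1 and action == "close":
            if tau - open_time <= tp:
                loc = 0                   # proper close within tp time units
            else:
                return True               # closed too late: violation
    return False

log = [("open", 2046, {"f": "Hakuchi.txt", "m": "rw"}),
       ("open", 2136, {"f": "Unagi.mp4", "m": "rw"}),
       ("close", 2166, {"f": "Hakuchi.txt"})]
print(monitor(log, tp=100, vp="Hakuchi.txt"))  # True (closed 120 > 100 time units later)
```

Symbolic monitoring (Sect. 5) dispenses with fixing tp and vp in advance and instead computes all parameter valuations under which such a violation occurs.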

Given a data parameter valuation ζ and a timing parameter valuation γ, we denote by γ|ζ(A) the resulting *timed data automaton (TDA)*, i.e., the non-parametric structure where all occurrences of a parameter vpᵢ (resp. tpⱼ) have been replaced by ζ(vpᵢ) (resp. γ(tpⱼ)). Note that, if V = LV = ∅, then A is a *parametric timed automaton* [2] and γ|ζ(A) is a *timed automaton* [1].

We now equip our TDAs with a concrete semantics.

**Definition 2 (Semantics of a TDA).** *Given a PTDA* A = (Σ, L, ℓ₀, F, C, TP, V, LV, μ₀, VP, E) *over a data type* (D, DE, DU)*, a data parameter valuation* ζ *and a timing parameter valuation* γ*, the semantics of* γ|ζ(A) *is given by the timed transition system (TTS)* (S, s₀, →)*, with*


Moreover we write ((ℓ, μ, ν), (e, η, d), (ℓ′, μ′, ν′)) ∈ → for a combination of a delay and discrete transition if ∃ν″ : (ℓ, μ, ν) −d→ (ℓ, μ, ν″) −e,η→ (ℓ′, μ′, ν′).

Given a TDA γ|ζ(A) with concrete semantics (S, s₀, →), we refer to the states of S as the *concrete states* of γ|ζ(A). A *run* of γ|ζ(A) is an alternating sequence of concrete states of γ|ζ(A) and triples of edges, local variable valuations and delays, starting from the initial state s₀, of the form (ℓ₀, μ₀, ν₀), (e₀, η₀, d₀), (ℓ₁, μ₁, ν₁), ··· with i = 0, 1, ..., eᵢ ∈ E, dᵢ ∈ R≥0 and ((ℓᵢ, μᵢ, νᵢ), (eᵢ, ηᵢ, dᵢ), (ℓᵢ₊₁, μᵢ₊₁, νᵢ₊₁)) ∈ →. Given such a run, the associated *timed data word* is (a₁, τ₁, η₁), (a₂, τ₂, η₂), ···, where aᵢ is the action of edge eᵢ₋₁, ηᵢ is the local variable valuation associated with that transition, and τᵢ = Σ₀≤ⱼ≤ᵢ₋₁ dⱼ, for i = 1, 2, ···. For a timed data word w and a concrete state (ℓ, μ, ν) of γ|ζ(A), we write (ℓ₀, μ₀, **0**) −w→ (ℓ, μ, ν) in γ|ζ(A) if w is associated with a run of γ|ζ(A) of the form (ℓ₀, μ₀, **0**), ..., (ℓₙ, μₙ, νₙ) with (ℓₙ, μₙ, νₙ) = (ℓ, μ, ν). For a timed data word w = (a₁, τ₁, η₁), (a₂, τ₂, η₂), ..., (aₙ, τₙ, ηₙ), we denote |w| = n and, for any i ∈ {1, 2, ..., n}, we denote w(1, i) = (a₁, τ₁, η₁), (a₂, τ₂, η₂), ..., (aᵢ, τᵢ, ηᵢ).
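The relation τᵢ = Σ₀≤ⱼ≤ᵢ₋₁ dⱼ between the delays along a run and the absolute timestamps of the associated timed data word is just a prefix sum; a small sketch (our own illustrative encoding):

```python
from itertools import accumulate

def to_timed_word(steps):
    """steps: list of (action, delay, eta) triples along a run; returns the
    associated timed data word with timestamps tau_i = d_0 + ... + d_{i-1}."""
    taus = list(accumulate(d for (_, d, _) in steps))
    return [(a, t, eta) for (a, _, eta), t in zip(steps, taus)]

# Delays of the run in Example 5 below: 2046, then 90, then 30
run = [("open", 2046, {"f": "a"}), ("open", 90, {"f": "b"}), ("close", 30, {"f": "a"})]
print(to_timed_word(run))
# [('open', 2046, {'f': 'a'}), ('open', 2136, {'f': 'b'}), ('close', 2166, {'f': 'a'})]
```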

A finite run is *accepting* if its last state (ℓ, μ, ν) is such that ℓ ∈ F. The *language* L(γ|ζ(A)) is defined as the set of timed data words associated with all accepting runs of γ|ζ(A).

*Example 5.* Consider the PTDA in Fig. 2b over the data type for strings. Let γ(tp) = 100 and ζ(vp) = Hakuchi.txt. An accepting run of the TDA γ|ζ(A) is: (ℓ₀, ∅, ν₀), (e₀, η₀, 2046), (ℓ₁, ∅, ν₁), (e₁, η₁, 90), (ℓ₁, ∅, ν₂), (e₂, η₂, 30), (ℓ₂, ∅, ν₃), where ∅ denotes a variable valuation over an empty domain (recall that V = ∅ in Fig. 2b), ν₀(c) = 0, ν₁(c) = 0, ν₂(c) = 90, ν₃(c) = 120, e₀ is the upper edge from ℓ₀ to ℓ₁, e₁ is the self-loop above ℓ₁, e₂ is the lower edge from ℓ₁ to ℓ₂, η₀(f) = η₂(f) = Hakuchi.txt, η₁(f) = Unagi.mp4, η₀(m) = η₁(m) = rw, and η₂(m) is undefined (because *Dom*(close) = {f}).

The associated timed data word is (open, 2046, η₀), (open, 2136, η₁), (close, 2166, η₂).

Since each action is associated with a set of local variables, given an ordering on this set, it is possible to see a given action and a variable valuation as a predicate: for example, assuming an ordering of LV such that f precedes m, then open with η₀ can be represented as open(Hakuchi.txt, rw). Using this convention, the log in Fig. 2a corresponds exactly to this timed data word.

#### **5 Symbolic Monitoring Against PTDA Specifications**

In symbolic monitoring, in addition to the (observable) actions in Σ, we employ *unobservable* actions denoted by ε and satisfying *Dom*(ε) = ∅. We write Σε for Σ ∪ {ε}. We let ηε be the local variable valuation such that ηε(lv) is undefined for any lv ∈ LV. For a timed data word w = (a₁, τ₁, η₁), (a₂, τ₂, η₂), ..., (aₙ, τₙ, ηₙ) over Σε, the projection w↓Σ is the timed data word over Σ obtained from w by removing any triple (aᵢ, τᵢ, ηᵢ) where aᵢ = ε. An edge e = (ℓ, tg, dg, a, R, PDU, ℓ′) ∈ E is *unobservable* if a = ε, and *observable* otherwise. The use of unobservable actions allows us to encode parametric timed pattern matching (see Sect. 5.3).
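The projection w↓Σ is a simple filter; a one-line sketch (with ε represented by the placeholder string "eps", an encoding choice of ours):

```python
def project(w):
    """w down-arrow Sigma: drop unobservable (epsilon) events from a timed data word."""
    return [(a, tau, eta) for (a, tau, eta) in w if a != "eps"]

w = [("a", 1.0, {}), ("eps", 1.5, {}), ("b", 2.0, {})]
print(project(w))  # [('a', 1.0, {}), ('b', 2.0, {})]
```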

We make the following assumption on the PTDAs in symbolic monitoring.

**Assumption 1.** *The PTDA* A *does not contain any loop of unobservable edges.*

#### **5.1 Problem Definition**

Roughly speaking, given a PTDA A and a timed data word w, the symbolic monitoring problem asks for the set of pairs (γ, ζ) ∈ (Q+)^TP × D^VP satisfying w(1, i) ∈ L(γ|ζ(A)), where w(1, i) is a prefix of w. Since A also contains unobservable edges, we in fact consider words w′ obtained from w by inserting unobservable actions.

#### **Symbolic monitoring problem:**

Input: a PTDA A over a data type (D, DE, DU) and actions Σε, and a timed data word w over Σ.
Problem: compute all the pairs (γ, ζ) of timing and data parameter valuations such that there is a timed data word w′ over Σε and i ∈ {1, 2, ..., |w′|} satisfying w′↓Σ = w and w′(1, i) ∈ L(γ|ζ(A)). That is, compute the *validity domain* D(w, A) = {(γ, ζ) | ∃w′ : ∃i ∈ {1, 2, ..., |w′|}, w′↓Σ = w and w′(1, i) ∈ L(γ|ζ(A))}.

*Example 6.* Consider the PTDA A and the timed data word w shown in Fig. 1. The validity domain D(w, A) is D(w, A) = D<sup>1</sup> ∪ D2, where

$$D_1 = \left\{ (\gamma, \zeta) \mid 0 \le \gamma(\mathrm{tp}) \le 2,\ \zeta(\mathrm{xp}) = \mathtt{c} \right\} \text{ and } D_2 = \left\{ (\gamma, \zeta) \mid 0 \le \gamma(\mathrm{tp}) \le 1,\ \zeta(\mathrm{xp}) = \mathtt{a} \right\}.$$

For w′ = w(1, 3) · (ε, 2.9, ηε), where · denotes concatenation, we have w′ ∈ L(γ|ζ(A)) and w′↓Σ = w(1, 3), where γ and ζ are such that γ(tp) = 1.8 and ζ(xp) = c.

For the data types in Example 2, the validity domain D(w, A) can be represented by a constraint of finite size because the length |w| of the timed data word is finite.

#### **5.2 Online Algorithm**

Our algorithm is *online* in the sense that it outputs (γ, ζ) ∈ D(w, A) as soon as its membership is witnessed, even before reading the whole timed data word w.

Let w = (a₁, τ₁, η₁), (a₂, τ₂, η₂), ..., (aₙ, τₙ, ηₙ) and A be the timed data word and the PTDA given in the symbolic monitoring problem, respectively. Intuitively, after reading (aᵢ, τᵢ, ηᵢ), our algorithm symbolically computes, for all parameter valuations (γ, ζ) ∈ (Q+)^TP × D^VP, the concrete states (ℓ, μ, ν) satisfying (ℓ₀, μ₀, **0**) −w(1,i)→ (ℓ, μ, ν) in γ|ζ(A). Since A has unobservable edges as well as observable edges, we have to add unobservable actions before or after observable actions in w. By *Conf*ᵒᵢ, we denote the configurations after reading (aᵢ, τᵢ, ηᵢ) such that no unobservable actions are appended after (aᵢ, τᵢ, ηᵢ). By *Conf*ᵘᵢ, we denote the configurations after reading (aᵢ, τᵢ, ηᵢ) such that at least one unobservable action is appended after (aᵢ, τᵢ, ηᵢ).

**Definition 3** (*Conf*ᵒᵢ, *Conf*ᵘᵢ). *For a PTDA* A *over actions* Σε*, a timed data word* w *over* Σ*, and* i ∈ {0, 1, ..., |w|} *(resp.* i ∈ {−1, 0, ..., |w|}*), Conf*ᵒᵢ *(resp. Conf*ᵘᵢ*) is the set of 5-tuples* (ℓ, ν, γ, μ, ζ) *such that there is a timed data word* w′ *over* Σε *satisfying the following: (i)* (ℓ₀, μ₀, **0**) −w′→ (ℓ, μ, ν) *in* γ|ζ(A)*, (ii)* w′↓Σ = w(1, i)*, (iii) the last action* a|w′| *of* w′ *is observable (resp. unobservable and its timestamp is less than* τᵢ₊₁*).*


Algorithm 1 shows an outline of our algorithm for symbolic monitoring (see [35] for the full version). Our algorithm incrementally computes *Conf*ᵘᵢ₋₁ and *Conf*ᵒᵢ (line 3). After reading (aᵢ, τᵢ, ηᵢ), our algorithm stores the partial results (γ, ζ) ∈ D(w, A) witnessed by the accepting configurations in *Conf*ᵘᵢ₋₁ and *Conf*ᵒᵢ (line 4). (We also need to try to take potential unobservable transitions and store the results from the accepting configurations *after* the last element of the timed data word (lines 5 and 6).)
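The control flow of this incremental loop can be paraphrased as follows. This is a schematic of ours, not Algorithm 1 itself: the symbolic successor computations are abstracted into callbacks, and the Confᵒ/Confᵘ split is simplified into a single set closed under zero or more trailing unobservable actions:

```python
def symbolic_monitor(w, initial_conf, obs_succ, unobs_succ, accepting):
    """Schematic online loop: read one event at a time, compute the symbolic
    successor configurations, and emit witnessed results immediately.
    obs_succ/unobs_succ/accepting are placeholders for the symbolic operations."""
    results = []
    conf = initial_conf                       # closure of the initial configuration
    for (a, tau, eta) in w:
        conf = obs_succ(conf, a, tau, eta)    # read (a_i, tau_i, eta_i)
        conf = unobs_succ(conf, tau)          # append 0+ unobservable actions
        results.extend(accepting(conf))       # store witnessed (gamma, zeta) now
    return results

# Trivial instantiation on sets of strings, just to exercise the control flow:
res = symbolic_monitor(
    [("a", 1, {}), ("b", 2, {})],
    initial_conf=frozenset({"q0"}),
    obs_succ=lambda c, a, t, e: frozenset(s + a for s in c),
    unobs_succ=lambda c, t: c,
    accepting=lambda c: [s for s in c if s.endswith("ab")])
print(res)  # ['q0ab']
```

The key property, outputting each result as soon as it is witnessed, follows from calling `accepting` inside the loop rather than after it.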

Since (Q+)^TP × D^VP is an infinite set, we cannot try each (γ, ζ) ∈ (Q+)^TP × D^VP, and we therefore use a symbolic representation for parameter valuations. As in the reachability synthesis of parametric timed automata [28], a set of clock and timing parameter valuations can be represented by a convex polyhedron. For variable valuations and data parameter valuations, we need an appropriate representation depending on the data type (D, DE, DU). Moreover, for the termination of Algorithm 1, some operations on the symbolic representation are required.

**Theorem 1 (termination).** *For any PTDA* A *over a data type* (D, DE, DU) *and actions* Σε*, and for any timed data word* w *over* Σ*, Algorithm 1 terminates if the following operations on the symbolic representation* Vd *of a set of variable and data parameter valuations terminate.*


*Example 7.* For the data type for rationals in Example 2, sets Vd of variable and data parameter valuations can be represented by convex polyhedra, and the above operations terminate. For the data type for strings S in Example 2, sets Vd of variable and data parameter valuations can be represented by S^|V| × (S ∪ Pfin(S))^|VP|, and the above operations terminate, where Pfin(S) is the set of finite subsets of S.
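One plausible reading of the S ∪ Pfin(S) component (an assumption on our part, since the paper leaves the representation abstract) is that a string parameter is constrained either to one concrete string (from = guards) or to anything outside a finite exclusion set (accumulated from ≠ guards):

```python
# Hypothetical symbolic domain for a single string parameter: its admissible
# values are either exactly one string (an element of S), or all strings
# outside a finite exclusion set (an element of Pfin(S)).

def admits(constraint, s):
    kind, data = constraint
    if kind == "eq":          # element of S: exactly this string
        return s == data
    else:                     # element of Pfin(S): any string not excluded
        return s not in data

print(admits(("eq", "foo.txt"), "foo.txt"))          # True
print(admits(("neq", {"a.txt", "b.txt"}), "c.txt"))  # True
print(admits(("neq", {"a.txt"}), "a.txt"))           # False
```

Since string guards only use = and ≠ (Example 2), every reachable constraint stays in this finite shape, which is what makes the operations of Theorem 1 terminate.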

**Fig. 3.** PTDAs in Dominant (left) and Periodic (right)

#### **5.3 Encoding Parametric Timed Pattern Matching**

The symbolic monitoring problem is a generalization of the parametric timed pattern matching problem of [4]. Recall that parametric timed pattern matching aims at synthesizing timing parameter valuations and *start and end times in the log* for which a log segment satisfies or violates a specification. In our approach, by adding a clock measuring the absolute time, and two timing parameters encoding respectively the start and end date of the segment, one can easily infer the log segments for which the property is satisfied.

Consider the Dominant PTDA (left of Fig. 3). It is inspired by the monitoring of withdrawals from bank accounts of various users [15]. This PTDA monitors situations where a user withdraws more than half of the total withdrawals within a time window of (50, 100). The actions are Σ = {withdraw} and *Dom*(withdraw) = {n, a}, where n has a string value and a has an integer value. The string n represents a user name and the integer a the amount of the withdrawal by user n. Observe that clock c is never reset, and therefore measures absolute time. The automaton can non-deterministically remain in ℓ₀, or start to measure a log segment by taking the ε-transition to ℓ₁ checking c = tp₁, thereby "remembering" the start time using timing parameter tp₁. Then, whenever a user vp has withdrawn more than half of the accumulated withdrawals (data guard 2v₁ > v₂) in a (50, 100) time window (timed guard c − tp₁ ∈ (50, 100)), the automaton takes an ε-transition to the accepting location, checking c = tp₂, thereby remembering the end time using timing parameter tp₂.
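For fixed (non-symbolic) parameter values, the Dominant property amounts to the following concrete check; this sketch is our own illustration of the aggregation the automaton performs (names and the toy log are hypothetical):

```python
# Does user `name` account for more than half of all withdrawals accumulated
# in the window (t0, t], with t - t0 in the open interval (50, 100)?

def dominant(log, name, t0, t):
    mine = sum(eta["a"] for (_, tau, eta) in log
               if t0 < tau <= t and eta["n"] == name)   # v1: this user's total
    total = sum(eta["a"] for (_, tau, eta) in log
                if t0 < tau <= t)                       # v2: everyone's total
    return 50 < t - t0 < 100 and 2 * mine > total       # guard 2*v1 > v2

log = [("withdraw", 10, {"n": "alice", "a": 100}),
       ("withdraw", 40, {"n": "bob", "a": 30}),
       ("withdraw", 70, {"n": "alice", "a": 20})]
print(dominant(log, "alice", 0, 75))  # True: alice withdrew 120 of 150
```

Symbolic monitoring replaces the fixed `name`, `t0` and `t` by the parameters vp, tp₁ and tp₂ and synthesizes all satisfying valuations at once.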

#### **6 Experiments**

We implemented our symbolic monitoring algorithm in a C++ tool, SyMon, whose data domains are strings and integers. SyMon is distributed at https://github.com/MasWag/symon. We use PPL [8] for the symbolic representation of the valuations. We note that we employ an optimization that merges adjacent polyhedra in the configurations when possible. We evaluated our monitoring algorithm against three original benchmarks: Copy in Fig. 1c; and Dominant and Periodic in Fig. 3. We conducted experiments on an Amazon EC2 c4.large instance (2.9 GHz Intel Xeon E5-2666 v3, 2 vCPUs, and 3.75 GiB RAM) running Ubuntu 18.04 LTS (64 bit).

#### **6.1 Benchmark 1: Copy**

Our first benchmark Copy is a monitoring of variable updates, much like the scenario in [18]. The actions are Σ = {update} and *Dom*(update) = {n, v}, where n has a string value representing the name of the updated variable and v has an integer value representing the updated value. Our benchmark set consists of 10 timed data words of length 4,000 to 40,000.

The PTDA in Copy is shown in Fig. 1c, where we give an additional constraint 3 < tp < 10 on tp. The property encoded in Fig. 1c is "for any variable px, whenever an update of that variable occurs, then within tp time units, the value of b must be equal to that update".

The experiment result is in Fig. 4. We observe that the execution time is linear in the number of events and the memory usage is more or less constant with respect to the number of events.

#### **6.2 Benchmark 2: Dominant**

Our second benchmark is Dominant (Fig. 3 left). Our benchmark set consists of 10 timed data words of length 2,000 to 20,000. The experiment result is in Fig. 5. We observe that the execution time is linear in the number of events and the memory usage is more or less constant with respect to the number of events.

#### **6.3 Benchmark 3: Periodic**

Our third benchmark Periodic is inspired by a parameter identification of periodic withdrawals from one bank account. The actions are Σ = {withdraw} and *Dom*(withdraw) = {a}, where a has an integer value representing the amount of the withdrawal. We randomly generated a set consisting of 10 timed data words of length 2,000 to 20,000. Each timed data word consists of the following three kinds of periodic withdrawals:

**shortperiod** One withdrawal occurs every 5 ± 1 time units. The amount of the withdrawal is 50 ± 3.

**middleperiod** One withdrawal occurs every 50 ± 3 time units. The amount of the withdrawal is 1000 ± 40.

**longperiod** One withdrawal occurs every 100 ± 5 time units. The amount of the withdrawal is 5000 ± 20.
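Logs mixing such periodic processes can be generated in a few lines; the following is an illustrative reconstruction of the benchmark generator (function name, seed and rounding are our assumptions, not the paper's actual generator):

```python
import random

def periodic_log(n, period, pj, amount, aj, seed=0):
    """Generate n withdrawals occurring every `period` +/- `pj` time units,
    each of amount `amount` +/- `aj` (uniform jitter)."""
    rng = random.Random(seed)
    t, log = 0.0, []
    for _ in range(n):
        t += period + rng.uniform(-pj, pj)
        log.append(("withdraw", t, {"a": round(amount + rng.uniform(-aj, aj))}))
    return log

# shortperiod: every 5 +/- 1 time units, amount 50 +/- 3
log = periodic_log(5, 5, 1, 50, 3)
print(all(47 <= eta["a"] <= 53 for (_, _, eta) in log))  # True
```

A full benchmark word would be the time-ordered merge of one such stream per period class (short, middle, long).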

The PTDA in Periodic is shown in the right of Fig. 3. The PTDA matches situations where, for any two successive withdrawals of amount more than vp, the duration between them is within [tp₁, tp₂]. By symbolic monitoring, one can thus identify that the period of the periodic withdrawals of amount greater than vp lies in [tp₁, tp₂]. An example of the validity domain is shown in the right figure.

The experiment result is in Fig. 5. We observe that the execution time is linear in the number of events and the memory usage is more or less constant with respect to the number of events.

#### **6.4 Discussion**

First, a positive result is that our algorithm effectively performs symbolic monitoring on more than 10,000 actions in one or two minutes, even though the PTDAs feature both timing and data parameters. The execution time in Copy is 50–100 times smaller than that in Dominant and Periodic. This is because the constraint 3 < tp < 10 in Copy is strict and the size of the configurations (i.e., *Conf*ᵒᵢ and *Conf*ᵘᵢ in Algorithm 1) is small. Another positive result is that, in all of the benchmarks, the execution time is linear and the memory usage is more or less constant in the size of the input word. This is because the size of the configurations (i.e., *Conf*ᵒᵢ and *Conf*ᵘᵢ in Algorithm 1) is bounded, for the following reason. In Dominant, the loop on ℓ₁ of the PTDA is deterministic, and because of the guard c − tp₁ ∈ (50, 100) on the edge from ℓ₁ to ℓ₂, the number of loop edges at ℓ₁ in an accepting run is bounded (if the duration between two consecutive actions is bounded, as in the current setting). Therefore, |*Conf*ᵒᵢ| and |*Conf*ᵘᵢ| in Algorithm 1 are bounded. The reason is similar in Copy. In Periodic, since the PTDA is deterministic and the amounts of the withdrawals take finitely many values, |*Conf*ᵒᵢ| and |*Conf*ᵘᵢ| in Algorithm 1 are bounded.

It is clear that one can design ad-hoc automata for which the execution time of symbolic monitoring grows much faster (e.g., exponentially in the size of the input word). However, our experiments showed that our algorithm monitors various interesting properties in reasonable time.

Copy and Dominant use data and timing parameters as well as memory and aggregation; from Table 1, no other monitoring tool can compute the valuations satisfying the specification. We nevertheless used the parametric timed model checker IMITATOR [3] to attempt such a synthesis, by encoding the input log as a separate automaton; IMITATOR ran out of memory (on a 3.75 GiB RAM computer) for Dominant with |w| = 2000, while SyMon terminates in 14 s using only 6.9 MiB on the same benchmark. Concerning Periodic, the only existing work that could possibly accommodate this specification is [7]. While a precise performance comparison is interesting future work (their implementation is not publicly available), we do not expect our implementation to be vastly outperformed: in [7], their tool times out (after 10 min) for a simple specification ("**E**[0,s2]**G**[0,s1](x<p)") and a signal discretized by only 128 points.

For those problem instances that MonPoly and DejaVu can accommodate (which are simpler and less parameterized than our benchmarks), they tend to run much faster than our tool. For example, [26] reports processing a trace of length 1,100,004 in 30.3 s. The trade-off here is expressivity: for example, DejaVu does not seem to accommodate Dominant, because it does not allow for aggregation. We also note that, while SyMon can be slower than MonPoly and DejaVu, it is fast enough for many real-world online monitoring scenarios.

#### **7 Conclusion and Perspectives**

We proposed a symbolic framework for monitoring using parameters both in data and time. Logs can use timestamps and infinite-domain data, while our monitor automata can use timing and data parameters (in addition to clocks and local variables). In addition, our online algorithm answers symbolically, by outputting all valuations (and possibly log segments) for which the specification is satisfied or violated. We implemented our approach in a prototype, SyMon, and experiments showed that our tool can effectively monitor logs of tens of thousands of events in a short time.

*Perspectives.* Combining the BDDs used in [26] with some of our data types (typically strings) could improve our approach by making it even more symbolic. Also, taking advantage of the polarity of some parameters (typically the timing parameters, in the line of [17]) could further improve efficiency.

We considered *infinite* domains, but the case of *finite* domains raises interesting questions concerning result representation: if the answer to a property is "neither a nor b", knowing the domain is {a, b, c}, then the answer should be c.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **STAMINA: STochastic Approximate Model-Checker for INfinite-State Analysis**

Thakur Neupane¹(B), Chris J. Myers², Curtis Madsen³, Hao Zheng⁴, and Zhen Zhang¹

¹ Utah State University, Logan, UT, USA
thakur.neupane@aggiemail.usu.edu, zhen.zhang@usu.edu
² University of Utah, Salt Lake City, UT, USA
myers@ece.utah.edu
³ Boston University, Boston, MA, USA
ckmadsen@bu.edu
⁴ University of South Florida, Tampa, FL, USA
haozheng@usf.edu

**Abstract.** Stochastic model checking is a technique for analyzing systems that possess probabilistic characteristics. However, its scalability is limited, as probabilistic models of real-world applications typically have very large or infinite state spaces. This paper presents a new infinite-state CTMC model checker, STAMINA, with improved scalability. It uses a novel state space approximation method to reduce large and possibly infinite state CTMC models to finite state representations that are amenable to existing stochastic model checkers. It is integrated with a new property-guided state expansion approach that improves the analysis accuracy. Demonstration of the tool on several benchmark examples shows promising results in terms of analysis efficiency and accuracy compared with a state-of-the-art CTMC model checker that deploys a similar approximation method.

**Keywords:** Stochastic model checking · Infinite-state · Markov chains

#### **1 Introduction**

Stochastic model checking is a formal method that designers and engineers can use to determine the likelihood of *safety* and *liveness* properties. Checking properties using numerical model checking techniques requires enumerating the state space of the system to determine the probability that the system is in any given state at a desired time [17]. Real-world applications often have very large or even infinite state spaces.

Numerous state representation, reduction, and approximation methods have been proposed. Symbolic model checking based on *multi-terminal binary decision diagrams* (MTBDDs) [23] has achieved success in representing large *Markov Decision Process* (MDP) models with a few distinct probabilistic choices at each state, e.g., the shared coin protocol [3]. MTBDDs, however, are often inefficient for models with many different and distinct probability/rate values due to the inefficient representation of solution vectors. *Continuous-time Markov chain* (CTMC) models, whose state transition rate is a function of state variables, generally contain many distinct rate values. As a result, symbolic model checkers can run out of memory while verifying a typical CTMC model with as few as 73,000 states [23]. State reduction techniques, such as bisimulation minimization [7,8,14], abstraction [6,12,14,20], symmetry reduction [5,16], and partial order reduction [9] have been mainly extended to discrete-time, finite-state probabilistic systems. The three-valued abstraction [14] can reduce large, finite-state CTMCs. It may, however, provide inconclusive verification results due to abstraction.

To the best of our knowledge, only a few tools can analyze infinite-state probabilistic models, namely, STAR [19] and INFAMY [10]. The STAR tool primarily analyzes biochemical reaction networks. It approximates solutions to the *chemical master equation* (CME) using the *method of conditional moments* (MCM) [11], which combines moment-based and state-based representations of probability distributions. This hybrid approach represents species with low concentrations using a discrete stochastic description, numerically integrating a small master equation using the fourth-order Runge-Kutta method over a small time interval [2]; and solves a system of conditional moment equations for higher-concentration species, conditioned on the low-concentration species. This method has been optimized to drop unlikely states and add likely states on-the-fly. STAR relies on a well-structured underlying Markov process with small sensitivity on the transient distribution. Also, it mainly reports state reachability probabilities, instead of checking a given probabilistic property. INFAMY is a truncation-based approach that explores the model's state space up to a certain finite depth k. The truncated state space still grows exponentially with respect to the exploration depth. Starting from the initial state, breadth-first state search is performed up to a certain finite depth. The error probability computed during model checking depends on the depth of state exploration; therefore, higher exploration depth generally incurs lower error probability.

This paper presents a new infinite-state stochastic model checker, *STochastic Approximate Model-checker for INfinite-state Analysis* (STAMINA). Our tool also takes a truncation-based approach. In particular, it maintains a probability estimate of each path being explored in the state space, and when the currently explored path probability drops below a specified threshold, it halts exploration of this path. All transitions exiting the path's last state are then redirected to an absorbing state. After all paths have been explored or truncated, transient Markov chain analysis is applied to determine the probability of a transient property of interest specified using *Continuous Stochastic Logic* (CSL) [4]. The calculated probability forms a lower bound on the true probability, while the upper bound additionally includes the probability mass accumulated in the absorbing state. The actual probability of the CSL property is guaranteed to lie within this range. An initial version of our tool and preliminary results are reported in [22]. Since that paper, our tool has been tightly integrated within the PRISM model checker [18] to improve performance, and we have also developed a new property-guided state expansion technique that expands the state space to tighten the reported probability range incrementally. This paper reports our results, which show significant improvement in both efficiency and verification accuracy over several non-trivial case studies from various application domains.
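As background for the transient analysis step mentioned above, the standard technique is uniformization; the following self-contained Python sketch is illustrative only (it is not STAMINA's implementation), and the two-state CTMC is chosen so the result can be checked against the closed form 1 − e^(−λt).

```python
import math

def uniformize(Q, p0, t, tol=1e-12):
    """Transient distribution of a finite CTMC at time t via uniformization.
    Q: generator matrix (list of lists), p0: initial distribution."""
    n = len(Q)
    # Uniformization rate: the maximum total exit rate (1.0 if all zero).
    lam = max(-Q[i][i] for i in range(n)) or 1.0
    # Embedded DTMC: P = I + Q / lam
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    result = [0.0] * n
    term = list(p0)              # holds p0 * P^k, updated in place
    weight = math.exp(-lam * t)  # Poisson(lam * t) pmf at k = 0
    acc, k = 0.0, 0
    while acc < 1.0 - tol:       # stop once Poisson mass is exhausted
        for i in range(n):
            result[i] += weight * term[i]
        acc += weight
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        k += 1
        weight *= lam * t / k
    return result

# Two-state chain: rate 0.5 from state 0 to absorbing state 1.
Q = [[-0.5, 0.5],
     [0.0, 0.0]]
p = uniformize(Q, [1.0, 0.0], 2.0)
```

For this chain the exact absorption probability at t = 2 is 1 − e^(−1) ≈ 0.632, which the sketch reproduces to within the truncation tolerance.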

#### **2 STAMINA**

Figure 1 presents the architecture of STAMINA. Based on a user-specified probability threshold κ (kappa), it first constructs a finite-state CTMC model C<sub>κ</sub> from the original infinite-state CTMC model C using the state space approximation method presented in Sect. 2.1. C<sub>κ</sub> is then checked using the PRISM explicit-state model checker against a given CSL property P<sub>∼p</sub>(φ), where ∼ ∈ {<, ≤, ≥, >} and p ∈ [0, 1] (for cases where it is desired that a predicate hold within a certain probability bound), or P<sub>=?</sub>(φ) (for cases where the exact probability of the predicate holding is to be calculated). Lower- and upper-bound probabilities that φ holds, namely, P<sub>min</sub> and P<sub>max</sub>, are then obtained; their difference, i.e., (P<sub>max</sub> − P<sub>min</sub>), is the probability accumulated in the absorbing state **x**<sub>abs</sub>, which abstracts all the states not included in the current state space. If p ∈ [P<sub>min</sub>, P<sub>max</sub>], it is not known whether P<sub>∼p</sub>(φ) holds. If the exact probability is of interest and the probability range is larger than the user-defined precision ε, i.e., (P<sub>max</sub> − P<sub>min</sub>) > ε, then the method does not give a meaningful result.

**Fig. 1.** Architecture of STAMINA.

For an inconclusive verification result from the previous step, STAMINA applies a property-guided approach, described in Sect. 2.2, to further expand C<sub>κ</sub>, provided P<sub>∼p</sub>(φ) is a non-nested "until" formula; otherwise, it uses the previous method to expand the state space. Note that κ is also reduced by the reduction factor κ<sub>r</sub> to allow states that were previously ignored due to a low probability estimate to be included in the current state expansion. The expanded CTMC model C<sub>κ</sub> is then checked to obtain a new probability bound [P<sub>min</sub>, P<sub>max</sub>]. This iterative process repeats until one of the following conditions holds: (1) the target probability p falls outside the probability bound [P<sub>min</sub>, P<sub>max</sub>], (2) the probability bound is sufficiently small, i.e., (P<sub>max</sub> − P<sub>min</sub>) < ε, or (3) a maximal number of iterations N has been reached.
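The iterative refinement loop can be sketched as follows; this is a minimal illustration, not STAMINA's API: `check(kappa)` stands in for the whole construct-and-model-check step, and all names are hypothetical.

```python
def refinement_loop(check, kappa, kappa_r=1000.0, eps=1e-3, max_iters=10,
                    p_target=None):
    """Iterative kappa-reduction driver (illustrative sketch).
    check(kappa) -> (p_min, p_max): builds the truncated CTMC at this
    threshold and model-checks it."""
    for _ in range(max_iters):
        p_min, p_max = check(kappa)
        if p_max - p_min < eps:          # (2) bound tight enough
            break
        if p_target is not None and not (p_min <= p_target <= p_max):
            break                        # (1) conclusive for P~p(phi)
        kappa /= kappa_r                 # explore previously ignored states
    return p_min, p_max

# Toy stand-in model: the bound width shrinks with kappa.
bounds = refinement_loop(lambda k: (0.3, 0.3 + k), kappa=1e-3)
```

With the toy `check`, the first iteration yields a bound of width 10<sup>−3</sup> (not yet below ε), so κ is divided by κ<sub>r</sub> and the second iteration converges.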

#### **2.1 State Space Approximation**

The state space approximation method [22] truncates the state space based on a user-specified reachability threshold κ. During state exploration, the reachability-value function κ̂ : **X** → R<sup>+</sup> estimates the probability of reaching a state on-the-fly and is compared against κ to determine whether the state search should terminate. Only states with a reachability-value higher than the reachability threshold are explored further.

Figure 2 illustrates the standard *breadth-first search* (BFS) state exploration for reachability threshold κ = 0.25. It starts from the initial state, whose reachability-value, i.e., κ̂(**x**<sub>0</sub>), is initialized to 1.0 as shown in Fig. 2a. In the first step, two new states **x**<sub>1</sub> and **x**<sub>4</sub> are generated, and their reachability-values are 0.8 and 0.2, respectively, as shown in Fig. 2b. The reachability-value of **x**<sub>0</sub> is distributed to its successor states, based on the probabilities of the outgoing transitions from **x**<sub>0</sub> to each successor state. For the next step, only state **x**<sub>1</sub> is scheduled for exploration because κ̂(**x**<sub>1</sub>) ≥ κ. Note that the transition from **x**<sub>4</sub> to **x**<sub>0</sub> is executed because **x**<sub>0</sub> is already in the explored set. Expanding **x**<sub>1</sub> leads to two new states, namely **x**<sub>2</sub> and **x**<sub>5</sub> as shown in Fig. 2c, of which only **x**<sub>5</sub> is scheduled for further exploration. This leads to the generation of **x**<sub>6</sub> and **x**<sub>9</sub> shown in Fig. 2d. State exploration terminates after Fig. 2e, since both newly generated states have reachability-values less than 0.25. States **x**<sub>2</sub>, **x**<sub>4</sub>, **x**<sub>6</sub>, and **x**<sub>9</sub> are marked as terminal states. During state exploration, the reachability-value of a state is updated every time a new incoming path is added to it, because the new path adds its contribution to the state, potentially bringing the reachability-value above κ, which in turn changes a terminal state to a non-terminal one. When the truncated CTMC model C<sub>κ</sub> is analyzed, it introduces some error in the probability value of the property under verification because of the leakage of probability (i.e., the cumulative path probabilities of reaching states not included in the explored state space) during the CTMC analysis. To

**Fig. 2.** State space approximation.

account for probability loss, an abstract absorbing state **x**abs is created as the sole successor state for all terminal states on each truncated path. Figure 2e shows the addition of the absorbing state.
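The κ-truncated exploration described above can be sketched compactly; this is an illustrative Python sketch only, with hypothetical names (`succ` returns successor states with branching probabilities), and the re-propagation of estimates into already-explored states is omitted for brevity.

```python
from collections import deque

ABSORB = "x_abs"  # abstract absorbing state for truncated paths

def truncated_states(init, succ, kappa):
    """BFS truncation sketch: succ(s) -> list of (s', prob).
    A state whose reach-estimate is below kappa is terminal; its
    outgoing mass is redirected to the single absorbing state."""
    est = {init: 1.0}
    explored = set()
    transitions = {}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        if s in explored:
            continue
        if est[s] < kappa:
            # Terminal for now; stays out of `explored`, so a later
            # re-queue with a higher estimate would re-examine it.
            transitions[s] = [(ABSORB, 1.0)]
            continue
        explored.add(s)
        ts = succ(s)
        transitions[s] = ts
        for s2, p in ts:
            # A new incoming path adds its contribution to the estimate.
            est[s2] = est.get(s2, 0.0) + est[s] * p
            if s2 not in explored:
                queue.append(s2)
    return transitions

def succ(s):
    """Toy infinite chain: each integer state branches 50/50."""
    if s == "done":
        return []
    return [(s + 1, 0.5), ("done", 0.5)]

trans = truncated_states(0, succ, kappa=0.01)
```

On the toy chain, the estimate of state n is 0.5<sup>n</sup>, so exploration stops at state 7 (0.5<sup>7</sup> < 0.01), which becomes terminal and feeds the absorbing state.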

#### **2.2 Property Based State Space Exploration**

This paper introduces a property-guided state expansion method in order to efficiently obtain a tightened probability bound. Since all non-nested CSL path formulas φ (except those containing the "next" operator) derive from the "until" formula Φ U<sup>I</sup> Ψ, constructing the set of terminal states for further expansion boils down to eliminating states that are known to satisfy or violate Φ U Ψ. Given a state graph, a path starting from the initial state can never satisfy Φ U Ψ if it includes a state satisfying ¬Φ ∧ ¬Ψ. Also, if a path includes a state satisfying Ψ, the satisfiability of Φ U Ψ can be determined without further expanding the path beyond the first Ψ-state. Our property-guided state space expansion method identifies the path prefixes from which the satisfiability of Φ U Ψ can be determined, and shortens them by making the last state of each prefix absorbing, based on the satisfiability of (¬Φ ∨ Ψ). Only the non-absorbing states whose path probability is greater than the state probability estimate threshold κ are expanded further. For detailed algorithms of STAMINA, readers are encouraged to consult [21].
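The frontier filtering for Φ U Ψ can be sketched as follows; the predicate and estimate names are hypothetical, and the state labels are supplied as simple Python callables rather than CSL machinery.

```python
def expand_frontier(states, sat_phi, sat_psi, est, kappa):
    """Property-guided filter for Phi U Psi (illustrative sketch).
    A state satisfying (not Phi) or Psi already decides the until
    formula on any path through it, so it is made absorbing; only
    undecided states above the estimate threshold are expanded."""
    expand, absorb = [], []
    for s in states:
        if (not sat_phi(s)) or sat_psi(s):
            absorb.append(s)          # satisfiability decided here
        elif est[s] > kappa:
            expand.append(s)          # undecided and likely enough
    return expand, absorb

# Toy frontier: state 3 satisfies Psi, state 4 violates Phi,
# state 2 is undecided but too unlikely to expand.
exp, ab = expand_frontier(
    [1, 2, 3, 4],
    sat_phi=lambda s: s < 4,
    sat_psi=lambda s: s == 3,
    est={1: 0.5, 2: 1e-9, 3: 0.5, 4: 0.5},
    kappa=1e-6)
```

In the toy run, only state 1 remains on the expansion frontier, while states 3 and 4 are made absorbing.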

#### **3 Results**

This section presents results on the following case studies to illustrate the accuracy and efficiency of STAMINA: a genetic toggle switch [20,22]; the following examples from the PRISM benchmark suite [15]: grid world robot, cyclic server polling system, and tandem queuing network; and the Jackson queuing network from the INFAMY case studies [1]. All case studies are evaluated on both STAMINA and INFAMY, except the genetic toggle switch.<sup>1</sup> Experiments are performed on a 3.2 GHz AMD Debian Linux PC with six cores and 64 GB of RAM. For all experiments, the maximal number of iterations N is set to 10, and the reduction factor κ<sub>r</sub> is set to 1000. All experiments terminate due to (P<sub>max</sub> − P<sub>min</sub>) < ε, where ε = 10<sup>−3</sup>, before they reach N. STAMINA is freely available at: https://github.com/formal-verification-research/stamina.

We compare the runtime, state space size, and verification results of STAMINA and INFAMY using the same precision ε = 10<sup>−3</sup>. For all tables in this section, column κ reports the probability estimate threshold used to terminate state generation in STAMINA. The state space size is listed in column |G|(K), where K indicates one thousand states. Column T(C/A) reports the state space construction (C) and analysis (A) time in seconds. For STAMINA, the total construction and analysis time is the sum of the runtimes for all κ values for a model configuration. Columns P<sub>min</sub> and P<sub>max</sub> list the lower and upper probability bounds for the property under verification, and column P lists the single probability value (within the precision ε) reported by INFAMY. We select the best runtime reported by the three configurations of INFAMY. The improvements in state space size (column |G|(X)) and runtime (column T(%)) are represented

<sup>1</sup> INFAMY generates arithmetic errors on the genetic toggle switch model.

by the ratio of state count generated by INFAMY to that of STAMINA (higher is better) and percentage improvement in runtime (higher is better), respectively.

**Genetic Toggle Switch.** The genetic toggle switch circuit model has two inputs, aTc and IPTG. It can be set to the OFF state by supplying it with aTc, and it can be set to the ON state by supplying it with IPTG [20]. Two important properties for a toggle switch circuit are the response time and the failure rate. The first experiments set IPTG to 100 to measure the toggle switch's response time. It should be noted that the input value of 100 molecules of IPTG is chosen to ensure that the circuit switches to the ON state. The latter experiments initialize IPTG to 0 to compute the failure rate, i.e., the probability that the circuit changes state erroneously within a cell cycle of 2,100 s (an approximation of the cell cycle in *E. coli* [24]). Initially, LacI is set to 60 and TetR is set to 0 for both experiments. The CSL property used for both experiments, P<sub>=?</sub> [true U<sup>≤2100</sup> (TetR > 40 ∧ LacI < 20)], describes the probability of the circuit switching to the ON state within a cell cycle of 2,100 s. The ON state is defined as LacI below 20 and TetR above 40 molecules.


**Table 1.** Verification results for genetic toggle switch.

The property-agnostic state space is generated with the probability estimate threshold κ = 10<sup>−3</sup>. Table 1 shows large probability bounds: [0, 0.999671] for IPTG = 100 and [0, 0.6975] for IPTG = 0. These are clearly far too imprecise w.r.t. the precision of 10<sup>−3</sup>. κ is then reduced to 10<sup>−6</sup>, and state generation switches to the property-guided state expansion mode, where the CSL property is used to guide state exploration based on the previous state graph. Each state expansion step reduces the κ value by a factor of κ<sub>r</sub> = 1000. To measure the effectiveness of the property-guided state expansion approach, we compare state graphs generated with and without property-guided state expansion, as indicated by the "property agnostic" and "property guided" rows in the table. Property-guided state expansion reduces the size of the state space without losing analysis precision for the same value of κ. Specifically, the state expansion approach reduces the state space by almost 20% for the response rate experiment.

**Robot World.** This case study considers a robot moving in an n-by-n grid and a janitor moving in a larger Kn-by-Kn grid, where the constant K is used to significantly scale up the state space. The robot starts from the bottom left corner and must reach the top right corner. The janitor moves around randomly. Either the robot or the janitor can occupy a grid location at any given time. The robot also randomly communicates with the base station. The property of interest is the probability that the robot reaches the top right corner within 100 time units while periodically communicating with the base station, encoded as P<sub>=?</sub> [ (P<sub>≥0.5</sub> [ true U<sup>≤7</sup> communicate ]) U<sup>≤100</sup> goal ].

Table 2 provides a comparison of results for K = 1024, 64 and n = 64, 32. For the smaller grid size, i.e., 32-by-32, the robot can reach the goal with a high probability of 97.56%, whereas for the larger configuration with n = 64 and K = 64, the robot is not able to reach the goal with considerable probability. STAMINA generates precise results similar to INFAMY's, while exploring less than half as many states in a shorter runtime.


**Jackson Queuing Network.** A Jackson queuing network consists of N interconnected nodes (queues) with infinite queue capacity. Initially, all queues are empty. Each station is connected to a single server, which distributes the arriving jobs to the different stations. Customers arrive as a Poisson stream with intensity λ for N queues. The model is taken from [10,13]. We compute the probability that, within 10 time units, the first queue has more than 3 jobs and the second queue has more than 5 jobs, given by P<sub>=?</sub> [ true U<sup>≤10</sup> (jobs<sub>1</sub> ≥ 4 ∧ jobs<sub>2</sub> ≥ 6)].

Table 2 summarizes the results for this model. STAMINA uses roughly equal time to construct and analyze the model for N = 5, whereas INFAMY takes significantly longer to construct the state space, making it slower in overall runtime. For N = 4, STAMINA is faster in generating verification results. In both configurations, STAMINA explores only approximately one third of the states explored by INFAMY.

**Cyclic Server Polling System.** This case study is based on a cyclic server attending N stations. We consider the probability that station one is polled within 10 time units, P<sub>=?</sub> [ true U<sup>≤10</sup> station1 polled ]. Table 2 summarizes the verification results for N = 12, 16, 20. The probability of station one being polled within 10 s is 1.0 for all configurations. As in the previous case studies, STAMINA explores a significantly smaller state space. The advantage of STAMINA in terms of runtime starts to manifest as the size of the model (and hence the state space) grows.

**Tandem Queuing Network.** A tandem queuing network is the simplest interconnected queuing network: two finite-capacity (c) queues with one server each [18]. Customers join the first queue and enter the second queue immediately upon completing service. This paper considers the probability that the first queue becomes full within 0.25 time units, given by the CSL property P<sub>=?</sub> [ true U<sup>≤0.25</sup> queue1 full ].

As seen in Table 2, there is almost a fifty percent probability that the first queue becomes full within 0.25 s, irrespective of the queue capacity. As in the polling server case study, STAMINA explores a significantly smaller state space. The runtimes are similar for the model with the smaller queue capacity (c = 2047), but STAMINA's runtime advantage grows as the queue capacity is increased.

#### **4 Conclusions**

This paper presents an infinite-state stochastic model checker, STAMINA, that uses path probability estimates to generate states with high probability and truncate unlikely states based on a specified threshold. Initial state construction is property agnostic, and the resulting state space is used for stochastic model checking of a given CSL property. The calculated probabilities form lower and upper bounds on the probability of the CSL property, which is guaranteed to lie within this range. If finer precision of the probability bound is required, a property-guided state expansion technique explores additional states to tighten the reported probability range incrementally. STAMINA is built on top of the PRISM model checker with tight integration through its API. Performance and accuracy evaluations on case studies from various application domains show significant improvement over the state-of-the-art infinite-state stochastic model checker INFAMY. For future work, we plan to investigate methods to determine the reduction factor on-the-fly based on the probability bound. Another direction is to investigate heuristics to further improve property-guided state expansion, as well as techniques to dynamically remove unlikely states.

**Acknowledgment.** Chris Myers is supported by the National Science Foundation under CCF-1748200. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Dynamical, Hybrid, and Reactive Systems

### **Local and Compositional Reasoning for Optimized Reactive Systems**

Mitesh Jain1(B) and Panagiotis Manolios<sup>2</sup>

<sup>1</sup> Synopsys Inc., Mountain View, USA mitesh.jain@synopsys.com <sup>2</sup> Northeastern University, Boston, USA pete@ccs.neu.edu

**Abstract.** We develop a compositional, algebraic theory of skipping refinement, as well as local proof methods to effectively analyze the correctness of optimized reactive systems. A verification methodology based on refinement involves showing that any infinite behavior of an optimized low-level implementation is a behavior of the high-level abstract specification. Skipping refinement is a recently introduced notion to reason about the correctness of optimized implementations that run faster than their specifications, *i.e.*, a step in the implementation can skip multiple steps of the specification. For the class of systems that exhibit bounded skipping, existing proof methods have been shown to be amenable to mechanized verification using theorem provers and model-checkers. However, reasoning about the correctness of reactive systems that exhibit unbounded skipping using these proof methods requires reachability analysis, significantly increasing the verification effort. In this paper, we develop two new sound and complete proof methods for skipping refinement. Even in the presence of unbounded skipping, these proof methods require only local reasoning and, therefore, are amenable to mechanized verification. We also show that skipping refinement is compositional, so it can be used in a stepwise refinement methodology. Finally, we illustrate the utility of the theory of skipping refinement by proving the correctness of an optimized event processing system.

#### **1 Introduction**

Reasoning about the correctness of a reactive system using refinement involves showing that any (infinite) observable behavior of a low-level, optimized implementation is a behavior allowed by the simple, high-level abstract specification. Several notions of refinement, like trace containment, (bi)simulation refinement, stuttering (bi)simulation refinement, and skipping refinement [4,10,14,20,22], have been proposed in the literature to directly account for the difference in the abstraction levels between a specification and an implementation. Two attributes of crucial importance that enable us to effectively verify complex reactive systems using refinement are: (1) Compositionality: this allows us to decompose a monolithic proof establishing that a low-level concrete implementation refines a high-level abstract specification into a sequence of simpler refinement proofs, where each of the intermediate refinement proofs can be performed independently using the verification tools best suited for it; (2) Effective proof methods: analyzing the correctness of a reactive system requires global reasoning about its infinite behaviors, a task that is often difficult for verification tools. Hence it is crucial that the refinement-based methodology also admits effective proof methods that are amenable to mechanized reasoning.

It is known that the (bi)simulation refinement and stuttering (bi)simulation refinement are compositional and support the stepwise refinement methodology [20,24]. Moreover, the proof methods associated with them are local, *i.e.*, they only require reasoning about states and their successors. Hence, they are amenable to mechanized reasoning. However, to the best of our knowledge, it is not known if skipping refinement is compositional. Skipping refinement is a recently introduced notion of refinement for verifying the correctness of optimized implementations that can "execute faster" than their simple high-level specifications, *i.e.*, a step in the implementation can *skip* multiple steps in the specification. Examples of such systems include superscalar processors, concurrent and parallel systems and optimizing compilers. Two proof methods, *reduced well-founded skipping simulation* and *well-founded skipping simulation* have been introduced to reason about skipping refinement for the class of systems that exhibit bounded skipping [10]. These proof methods were used to verify the correctness of several systems that otherwise were difficult to automatically verify using current model-checkers and automated theorem provers. However, when skipping is unbounded, the proof methods in [10] require reachability analysis, and therefore are not amenable to automated reasoning. To motivate the need for alternative proof methods for effective reasoning, we consider the event processing system (EPS), discussed in [10].

#### **1.1 Motivating Example**

An abstract high-level specification, AEPS, of an event processing system is defined as follows. Let E be a set of *events* and V be a set of *state variables*. A *state* of AEPS is a triple ⟨t, *Sch*, St⟩, where t is a natural number denoting the current time; *Sch* is a set of pairs ⟨e, t<sub>e</sub>⟩, where e ∈ E is an event scheduled to be executed at time t<sub>e</sub> ≥ t; and St is an assignment to the state variables in V. The transition relation for the AEPS system is defined as follows. If at time t there is no ⟨e, t⟩ ∈ *Sch*, *i.e.*, there is no event scheduled to be executed at time t, then t is incremented by 1. Otherwise, we (nondeterministically) choose and execute an event of the form ⟨e, t⟩ ∈ *Sch*. The execution of an event may modify St and may also remove and add a finite number of new pairs ⟨e′, t′⟩ to *Sch*. We require that t′ > t. Finally, execution involves removing the executed event ⟨e, t⟩ from *Sch*. Now consider tEPS, an optimized implementation of AEPS. As before, a state is a triple ⟨t, *Sch*, St⟩. However, unlike the abstract system, which just increments time by 1 when there are no events scheduled at the current time, the optimized system finds the earliest future time at which an event is scheduled to execute. The transition relation of tEPS is defined as follows. An event ⟨e, t<sub>e</sub>⟩ with the minimum time is selected, t is updated to t<sub>e</sub>, and the event e is executed, as in AEPS. Consider the executions of AEPS and tEPS in Fig. 1. (We only show prefixes of the executions.) Suppose that at t = 0, *Sch* = {⟨e<sub>1</sub>, 0⟩}, and the execution of event e<sub>1</sub> adds a new pair ⟨e<sub>2</sub>, k⟩ to *Sch*, where k is a positive integer. AEPS, at t = 0, executes the event e<sub>1</sub>, adds the new pair ⟨e<sub>2</sub>, k⟩ to *Sch*, and updates t to 1. Since no events are scheduled to execute before t = k, the AEPS system repeatedly increments t by 1 until t = k. At t = k, it executes the event e<sub>2</sub>. At time t = 0, tEPS executes e<sub>1</sub>. The next event is scheduled to execute at time t = k; hence tEPS updates t to k in one step. Next, in one step, it executes the event e<sub>2</sub>. Note that tEPS runs faster than AEPS by *skipping* over abstract states when no event is scheduled for execution at the current time. If k > 1, the step from s<sub>2</sub> to s<sub>3</sub> in tEPS corresponds neither to stuttering nor to a single step of AEPS. Therefore, notions of refinement based on stuttering simulation and bisimulation cannot be used to show that tEPS refines AEPS.
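The tEPS transition relation can be sketched with a priority queue; this is an illustrative Python sketch with hypothetical names (`schedule`, `execute`), and the example reproduces the e1/e2 scenario: e1 at time 0 schedules e2 at time k, and tEPS jumps directly to k instead of ticking through the idle steps.

```python
import heapq

def run_teps(schedule, execute, t_end):
    """tEPS sketch: jump directly to the earliest scheduled event.
    schedule: heap of (time, event); execute(event, t) returns a list
    of newly scheduled (time', event') pairs with time' > t."""
    trace = []
    while schedule and schedule[0][0] <= t_end:
        te, e = heapq.heappop(schedule)   # minimum-time event
        trace.append((te, e))             # t is updated to te in one step
        for item in execute(e, te):
            heapq.heappush(schedule, item)
    return trace

# e1 at t = 0 schedules e2 at t = k; AEPS would tick k - 1 idle steps.
k = 1000
trace = run_teps([(0, "e1")],
                 lambda e, t: [(k, "e2")] if e == "e1" else [],
                 t_end=2000)
```

The trace visits exactly two time points, 0 and k, which is the unbounded skip that the stuttering-based notions cannot match.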

**Fig. 1.** Event simulation system

It was argued in [10] that skipping refinement is an appropriate notion of correctness that directly accounts for the skipping behavior exhibited by tEPS. However, though tEPS was used to motivate the need for a new notion of refinement, the proof methods proposed in [10] are not effective for proving the correctness of tEPS. This is because the execution of an event in tEPS may add new events that are scheduled to execute at an arbitrary time in the future, *i.e.*, in general, k in the above example execution is unbounded. Hence, the proof methods in [10] would require unbounded reachability analysis, which is often problematic for automated verification tools. Even in the particular case when one can a priori determine an upper bound on k and unroll the transition relation, the proof methods in [10] are viable for mechanical reasoning only if the upper bound on k is relatively small.

In this paper, we develop local proof methods to effectively analyze the correctness of optimized reactive systems using skipping refinement. These proof methods reduce global reasoning about infinite computations to local reasoning about states and their successors, and they are applicable even if the optimized implementation exhibits unbounded skipping. Moreover, we show that the proposed proof methods are complete, *i.e.*, if a system M<sub>1</sub> is a skipping refinement of M<sub>2</sub> under a suitable refinement map, then we can always reason about them locally. We also develop an algebraic theory of skipping refinement. In particular, we show that skipping simulation is closed under relational composition. Thus, skipping refinement aligns with the stepwise refinement methodology. Finally, we illustrate the benefits of the theory of skipping refinement and the associated proof methods by verifying the correctness of optimized event processing systems in ACL2s [3].

#### **2 Preliminaries**

A transition system model of a reactive system captures the concept of a state, atomic transitions that modify state during the course of a computation, and what is observable in a state. Any system with a well defined operational semantics can be mapped to a labeled transition system.

**Definition 1 Labeled Transition System.** *A labeled transition system (TS) is a structure* ⟨S, →, L⟩*, where* S *is a non-empty (possibly infinite) set of states,* → ⊆ S × S *is a left-total transition relation (every state has a successor), and* L *is a labeling function whose domain is* S*.*

*Notation:* We first describe the notational conventions used in the paper. Function application is sometimes denoted by an infix dot "." and is left-associative. The composition of a relation R with itself i times (for 0 < i ≤ ω) is denoted R<sup>i</sup> (ω = N is the first infinite ordinal). Given a relation R and 1 < k ≤ ω, R<sup><k</sup> denotes ∪<sub>1≤i<k</sub> R<sup>i</sup> and R<sup>≥k</sup> denotes ∪<sub>ω>i≥k</sub> R<sup>i</sup>. Instead of R<sup><ω</sup> we often write the more common R<sup>+</sup>. ⊎ denotes the disjoint union operator. Quantified expressions are written as ⟨*Q* x : r : t⟩, where *Q* is the quantifier (*e.g.*, ∃, ∀, *min*, ∪), x is a bound variable, r is an expression that denotes the range of the variable x (*true*, if omitted), and t is a term.
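For finite relations, the operators R; S and R<sup>+</sup> used below can be sketched directly; this is an illustrative Python sketch (relations as sets of pairs), not part of the paper's formal development.

```python
def compose(R, S):
    """Relational composition R;S on finite relations (sets of pairs)."""
    return {(a, c) for (a, b1) in R for (b2, c) in S if b1 == b2}

def r_plus(R):
    """Transitive closure R+ = union of R^i for i >= 1 (finite R only)."""
    closure, frontier = set(R), set(R)
    while frontier:
        # Extend by one more composition step until nothing new appears.
        frontier = compose(frontier, R) - closure
        closure |= frontier
    return closure

plus = r_plus({(1, 2), (2, 3)})
```

On the two-step chain {(1, 2), (2, 3)}, the closure adds exactly the pair (1, 3).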

Let M = ⟨S, →, L⟩ be a transition system. An M-path is a sequence of states such that for adjacent states s and u, s → u. The j-th state in an M-path σ is denoted by σ.j. An M-path σ starting at state s is a *fullpath*, denoted by *fp*.σ.s, if it is infinite. An M-segment v<sub>1</sub>, ..., v<sub>k</sub>, where k ≥ 1, is a finite M-path and is also denoted by v⃗. The length of an M-segment v⃗ is denoted by |v⃗|. Let *INC* be the set of strictly increasing sequences of natural numbers starting at 0. The i-th partition of a fullpath σ with respect to π ∈ *INC*, denoted by <sup>π</sup>σ<sup>i</sup>, is given by the M-segment σ(π.i), ..., σ(π(i + 1) − 1).
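On a finite path prefix, the partitioning induced by π ∈ *INC* can be sketched as follows (an illustrative Python sketch; the path is a plain list and π a list of cut indices).

```python
def partitions(sigma, pi):
    """Split a finite path prefix into segments by pi in INC:
    the i-th segment is sigma[pi[i] .. pi[i+1]-1]."""
    # pi must be strictly increasing and start at 0.
    assert pi[0] == 0 and all(a < b for a, b in zip(pi, pi[1:]))
    return [sigma[pi[i]:pi[i + 1]] for i in range(len(pi) - 1)]

segs = partitions(["s0", "s1", "s2", "s3", "s4"], [0, 1, 3, 5])
```

Here π = 0, 1, 3, 5 yields the three non-empty, finite segments ⟨s0⟩, ⟨s1, s2⟩, ⟨s3, s4⟩.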

#### **3 Theory of Skipping Refinement**

In this section we first briefly recall the notion of skipping simulation as described in [10]. We then study the algebraic properties of skipping simulation and show that a theory of refinement based on it is compositional and therefore can be used in a stepwise refinement based verification methodology.

The definition of skipping simulation is based on the notion of *matching*. Informally, a fullpath σ matches a fullpath δ under the relation B iff the fullpaths can be partitioned into non-empty, finite segments such that all elements in a segment of σ are related to the first element in the corresponding segment of δ.

**Definition 2 smatch** [10]*. Let* M = ⟨S, →, L⟩ *be a transition system and* σ, δ *be fullpaths in* M*. For* π, ξ ∈ *INC and a binary relation* B ⊆ S × S*, we define*

$$\begin{aligned} &scorr(B, \sigma, \pi, \delta, \xi) \equiv \langle \forall i \in \omega \ :: \langle \forall s \in {}^{\pi}\sigma^{i} \ :: sB\delta(\xi.i) \rangle \rangle \ and \\ &smatch(B, \sigma, \delta) \equiv \langle \exists \pi, \xi \in INC \mathrel{\colon} \ scorr(B, \sigma, \pi, \delta, \xi) \rangle. \end{aligned}$$

Figure 1 illustrates the notion of matching using our running example: σ is a fullpath of the concrete system and δ is a fullpath of the abstract system. (The figure only shows prefixes of the fullpaths.) The other parameter for matching is the relation B, which here is just the identity relation. In order to show that *smatch*(*B*, σ, δ) holds, we have to find π, ξ ∈ *INC* satisfying the definition. In Fig. 1, we separate the partitions induced by our choice of π, ξ using −− and connect elements related by B. Since all elements of a σ partition are related to the first element of the corresponding δ partition, *scorr*(*B*, σ, π, δ, ξ) holds, and therefore *smatch*(*B*, σ, δ) holds.
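Restricted to finite path prefixes, the *scorr* condition can be checked directly; the following Python sketch is illustrative (the paths, partitions, and relation below are toy data, not the figure's exact states).

```python
def scorr(B, sigma, pi, delta, xi):
    """Finite-prefix check of scorr: every state in the i-th pi-segment
    of sigma must be B-related to the first state of the i-th
    xi-segment of delta, i.e. delta[xi[i]]."""
    n = min(len(pi), len(xi)) - 1
    return all(B(sigma[j], delta[xi[i]])
               for i in range(n)
               for j in range(pi[i], pi[i + 1]))

eq = lambda a, b: a == b       # identity relation, as in the example
sigma = ["a", "b"]             # concrete path: skips the idle state "x"
delta = ["a", "x", "b"]        # abstract path
ok = scorr(eq, sigma, [0, 1, 2], delta, [0, 2, 3])
```

With ξ = 0, 2, 3 the second σ-segment ⟨b⟩ is matched against the first element of the second δ-segment, δ.2 = b, so the check succeeds; a partition such as ξ = 0, 1, 3 would fail.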

Using the notion of matching, skipping simulation is defined as follows. Notice that skipping simulation is defined using a single transition system; it is easy to lift the notion defined on a single transition system to one that relates two transition systems by taking the disjoint union of the transition systems.

**Definition 3 Skipping Simulation (SKS).** B ⊆ S × S *is a skipping simulation on a TS* M = ⟨S, →, L⟩ *iff for all* s, w *such that* sBw*, both of the following hold.*

*(SKS1)* L.s = L.w *(SKS2)* ⟨∀σ : *fp*.σ.s : ⟨∃δ : *fp*.δ.w : *smatch*(*B*, σ, δ)⟩⟩

**Theorem 1.** *Let* M *be a TS. If* B *is a stuttering simulation (STS) on* M *then* B *is an SKS on* M*.*

*Proof:* Follows directly from the definitions of SKS and STS [18].

#### **3.1 Algebraic Properties**

We now study the algebraic properties of SKS. We show that it is closed under arbitrary union. We also show that SKS is closed under relational composition. The latter property is particularly useful since it enables us to use stepwise refinement and to modularly analyze the correctness of complex systems.

**Lemma 1.** *Let* M *be a TS and* C *be a set of SKSs on* M*. Then* G = ⟨∪B : B ∈ C : B⟩ *is an SKS on* M*.*

**Corollary 1.** *For any TS* M*, there is a greatest SKS on* M*.*

**Lemma 2.** *SKSs are not closed under negation or intersection.*

The following lemma shows that skipping simulation is closed under relational composition.

**Lemma 3.** *Let* M *be a TS. If* P *and* Q *are SKSs on* M*, then* R = P; Q *is an SKS on* M*.*

*Proof:* To show that R is an SKS on M = ⟨S, −→, L⟩, we show that for any s, w ∈ S such that sRw, SKS1 and SKS2 hold. Let s, w ∈ S and sRw. From the definition of R, there exists x ∈ S such that sPx and xQw. Since P and Q are SKSs on M, L.s = L.x = L.w; hence, SKS1 holds for R.

To prove that SKS2 holds for R, consider a fullpath σ starting at s. Since P and Q are SKSs on M, there is a fullpath τ in M starting at x, a fullpath δ in M starting at w and α, β, θ, γ ∈ *INC* such that *scorr* (*P*, σ, α, τ, β) and *scorr* (*Q*, τ, θ, δ, γ) hold. We use the fullpath δ as a witness and define π, ξ ∈ *INC* such that *scorr* (*R*, σ, π, δ, ξ) holds.

We define a function, r, that, given i, corresponding to the index of a partition of τ under β, returns the index of the partition of τ under θ in which the first element of τ's i-th partition under β resides: r.i = j iff θ.j ≤ β.i < θ(j + 1). Note that r is indeed a function, as every element of τ resides in exactly one partition of θ. Also, since there is a correspondence between the partitions of α and β (by *scorr*(*P*, σ, α, τ, β)), we can apply r to indices of partitions of σ under α to find where the first element of the corresponding β partition resides. Note that r is non-decreasing: a < b ⇒ r.a ≤ r.b.

We define πα ∈ *INC*, a strictly increasing sequence that will allow us to merge adjacent partitions in α as needed to define the strictly increasing sequence π on σ used to prove SKS2. Partitions in π will consist of one or more α partitions. Given i, corresponding to the index of a partition of σ under π, the function πα returns the index of the corresponding partition of σ under α.

$$\pi\alpha(0) = 0$$

$$\pi\alpha(i) = \min j \in \omega \text{ s.t. } |\{k : 0 < k \le j \land r.k \ne r(k-1)\}| = i$$

Note that πα is an increasing function, *i.e.*, a<b ⇒ πα(a) < πα(b). We now define π as follows.

$$\pi.i = \alpha(\pi \alpha.i)$$

There is an important relationship between r and πα:

$$r(\pi\alpha.i) = \dots = r(\pi\alpha(i+1) - 1)$$

That is, for all α partitions that are in the same π partition, the initial states of the corresponding β partitions are in the same θ partition.

We define ξ as follows: ξ.i = γ(r(πα.i)).
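On finite prefixes, this construction of r, πα, π, and ξ can be carried out mechanically; the following sketch (our names; α, β, θ, γ are given as lists of partition start indices) mirrors the definitions above:

```python
import bisect

def compose_witnesses(alpha, beta, theta, gamma):
    """Given scorr witnesses (alpha, beta) for P and (theta, gamma) for Q,
    build (pi, xi) witnessing scorr for R = P;Q (finite-prefix sketch)."""
    # r.i = j iff theta[j] <= beta[i] < theta[j+1]
    def r(i):
        return bisect.bisect_right(theta, beta[i]) - 1

    # pi_alpha(0) = 0; thereafter the indices where r changes value,
    # i.e., where a new theta partition is entered
    pa = [0] + [k for k in range(1, len(beta)) if r(k) != r(k - 1)]
    pi = [alpha[j] for j in pa]          # pi.i = alpha(pi_alpha.i)
    xi = [gamma[r(j)] for j in pa]       # xi.i = gamma(r(pi_alpha.i))
    return pi, xi

# beta's partitions 0 and 1 both start inside theta's partition 0, so
# the first two alpha partitions merge into a single pi partition.
pi, xi = compose_witnesses([0, 1, 2], [0, 1, 4], [0, 3], [0, 5])
assert (pi, xi) == ([0, 2], [0, 5])
```

`bisect_right` finds the largest j with θ.j ≤ β.i, which is exactly the defining property of r.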

We are now ready to prove SKS2. Let $s \in {}^{\pi}\sigma^{i}$. We show that sRδ(ξ.i). By the definition of π, we have

$$s \in {}^{\alpha}\sigma^{\pi\alpha.i} \vee \dots \vee s \in {}^{\alpha}\sigma^{\pi\alpha(i+1)-1}$$

Hence,

$$sP\tau(\beta(\pi\alpha.i)) \lor \dots \lor sP\tau(\beta(\pi\alpha(i+1)-1))$$

Note that by the definition of r (apply r to πα.i):

$$
\theta(r(\pi\alpha.i)) \le \beta(\pi\alpha.i) < \theta(r(\pi\alpha.i) + 1)
$$

Hence,

$$\tau(\beta(\pi\alpha.i))Q\delta(\gamma(r(\pi\alpha.i))) \vee \dots \vee \tau(\beta(\pi\alpha(i+1)-1))Q\delta(\gamma(r(\pi\alpha(i+1)-1)))$$

By the definition of ξ and the relationship between r and πα described above, we simplify the above formula as follows.

$$\tau(\beta(\pi\alpha.i))Q\delta(\xi.i) \vee \dots \vee \tau(\beta(\pi\alpha(i+1)-1))Q\delta(\xi.i)$$

Therefore, by the definition of R, we have that sRδ(ξ.i) holds.

**Theorem 2.** *The reflexive transitive closure of an SKS is an SKS.*

**Theorem 3.** *Given a TS* M*, the greatest SKS on* M *is a preorder.*

*Proof.* Let G be the greatest SKS on M. From Theorem 2, G<sup>∗</sup> is an SKS. Hence G<sup>∗</sup> ⊆ G. Furthermore, since G ⊆ G<sup>∗</sup>, we have that G = G<sup>∗</sup>, *i.e.*, G is reflexive and transitive.

#### **3.2 Skipping Refinement**

We now recall the notion of skipping refinement [10]. We use skipping simulation, a notion defined in terms of a single transition system, to define skipping refinement, a notion that relates *two* transition systems: an *abstract* transition system and a *concrete* transition system. Informally, if a concrete system is a skipping refinement of an abstract system, then its observable behaviors are also behaviors of the abstract system, modulo skipping (which includes stuttering). The notion is parameterized by a *refinement map*, a function that maps concrete states to their corresponding abstract states. A refinement map along with a labeling function determines what is observable at a concrete state.

**Definition 4 Skipping Refinement.** *Let* M<sub>A</sub> = ⟨S<sub>A</sub>, −→<sub>A</sub>, L<sub>A</sub>⟩ *and* M<sub>C</sub> = ⟨S<sub>C</sub>, −→<sub>C</sub>, L<sub>C</sub>⟩ *be transition systems and let* r : S<sub>C</sub> → S<sub>A</sub> *be a refinement map. We say* M<sub>C</sub> *is a skipping refinement of* M<sub>A</sub> *with respect to* r*, written* M<sub>C</sub> ≲<sub>r</sub> M<sub>A</sub>*, if there exists a binary relation* B *such that all of the following hold.*

1. ⟨∀s ∈ S<sub>C</sub> :: sB(r.s)⟩
2. B *is an SKS on the disjoint union* ⟨S<sub>C</sub> ⊎ S<sub>A</sub>, −→<sub>C</sub> ∪ −→<sub>A</sub>, L⟩*, where* L.s = L<sub>A</sub>(s) *if* s ∈ S<sub>A</sub>*, and* L.s = L<sub>A</sub>(r.s) *otherwise.*

Next, we use the property that skipping simulation is closed under relational composition to show that skipping refinement supports modular reasoning using a stepwise refinement approach. In order to verify that a complex low-level implementation M<sub>C</sub> refines a simple high-level abstract specification M<sub>A</sub>, one proceeds as follows: starting with M<sub>A</sub>, define a sequence of intermediate systems leading to the final complex implementation M<sub>C</sub>, where any two successive systems in the sequence differ only in relatively few aspects of their behavior. We then show that, at each step in the sequence, the system at the current step is a refinement of the previous one. Since at each step the verification effort is focused only on the few differences in behavior between the two systems under consideration, the proof obligations are simpler than a monolithic proof. Note that this methodology is orthogonal to (horizontal) modular reasoning that infers the correctness of a system from the correctness of its sub-components.

**Theorem 4.** *Let* M<sub>1</sub> = ⟨S<sub>1</sub>, −→<sub>1</sub>, L<sub>1</sub>⟩*,* M<sub>2</sub> = ⟨S<sub>2</sub>, −→<sub>2</sub>, L<sub>2</sub>⟩*, and* M<sub>3</sub> = ⟨S<sub>3</sub>, −→<sub>3</sub>, L<sub>3</sub>⟩ *be TSs,* p : S<sub>1</sub> → S<sub>2</sub> *and* r : S<sub>2</sub> → S<sub>3</sub>*. If* M<sub>1</sub> ≲<sub>p</sub> M<sub>2</sub> *and* M<sub>2</sub> ≲<sub>r</sub> M<sub>3</sub>*, then* M<sub>1</sub> ≲<sub>p;r</sub> M<sub>3</sub>*.*

*Proof:* Since M<sub>1</sub> ≲<sub>p</sub> M<sub>2</sub>, we have an SKS, say A, such that ⟨∀s ∈ S<sub>1</sub> :: sA(p.s)⟩. Furthermore, without loss of generality we can assume that A ⊆ S<sub>1</sub> × S<sub>2</sub>. Similarly, since M<sub>2</sub> ≲<sub>r</sub> M<sub>3</sub>, we have an SKS, say B, such that ⟨∀s ∈ S<sub>2</sub> :: sB(r.s)⟩ and B ⊆ S<sub>2</sub> × S<sub>3</sub>. Define C = A; B. Then we have that C ⊆ S<sub>1</sub> × S<sub>3</sub> and ⟨∀s ∈ S<sub>1</sub> :: sC(r(p.s))⟩. Also, from Lemma 3, C is an SKS on ⟨S<sub>1</sub> ⊎ S<sub>3</sub>, −→<sub>1</sub> ∪ −→<sub>3</sub>, L⟩, where L.s = L<sub>3</sub>(s) if s ∈ S<sub>3</sub>, and L.s = L<sub>3</sub>(r(p.s)) otherwise.

Formally, to establish that a complex low-level implementation M<sub>C</sub> refines a simple high-level abstract specification M<sub>A</sub>, one defines intermediate systems M<sub>1</sub>, ..., M<sub>n</sub>, where n ≥ 1, and establishes the following: M<sub>C</sub> = M<sub>0</sub> ≲<sub>r0</sub> M<sub>1</sub> ≲<sub>r1</sub> ... ≲<sub>rn−1</sub> M<sub>n</sub> = M<sub>A</sub>. Then from Theorem 4, we have that M<sub>C</sub> ≲<sub>r</sub> M<sub>A</sub>, where r = r<sub>0</sub>; r<sub>1</sub>; ... ; r<sub>n−1</sub>. We illustrate the utility of this approach in Sect. 5 by proving the correctness of an optimized event processing system.

**Theorem 5.** *Let* M = ⟨S, −→, L⟩ *be a TS. Let* M′ = ⟨S′, −→′, L′⟩ *where* S′ ⊆ S*,* −→′ ⊆ S′ × S′*,* −→′ *is a left-total subset of* −→<sup>+</sup>*, and* L′ = L|<sub>S′</sub>*. Then* M′ ≲<sub>I</sub> M*, where* I *is the identity function on* S′*.*

**Corollary 2.** *Let* M<sub>C</sub> = ⟨S<sub>C</sub>, −→<sub>C</sub>, L<sub>C</sub>⟩ *and* M<sub>A</sub> = ⟨S<sub>A</sub>, −→<sub>A</sub>, L<sub>A</sub>⟩ *be TSs and* r : S<sub>C</sub> → S<sub>A</sub> *a refinement map. Let* M′<sub>C</sub> = ⟨S′<sub>C</sub>, −→′<sub>C</sub>, L′<sub>C</sub>⟩ *where* S′<sub>C</sub> ⊆ S<sub>C</sub>*,* −→′<sub>C</sub> *is a left-total subset of* −→<sub>C</sub><sup>+</sup>*, and* L′<sub>C</sub> = L<sub>C</sub>|<sub>S′<sub>C</sub></sub>*. If* M<sub>C</sub> ≲<sub>r</sub> M<sub>A</sub> *then* M′<sub>C</sub> ≲<sub>r′</sub> M<sub>A</sub>*, where* r′ *is* r|<sub>S′<sub>C</sub></sub>*.*
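The hypothesis of Theorem 5, that the reduced transition relation is a left-total subset of −→<sup>+</sup>, is straightforward to check on finite systems; a sketch under our own naming:

```python
def reach_plus(step, s):
    """States reachable from s in one or more steps (finite systems)."""
    seen, frontier = set(), set(step(s))
    while frontier:
        seen |= frontier
        frontier = {v for u in frontier for v in step(u)} - seen
    return seen

def left_total_subset_of_plus(S_red, step_red, step):
    # every reduced state has a successor (left-total) and every reduced
    # step lands inside ->+ of the original system
    return all(step_red(s) and step_red(s) <= reach_plus(step, s)
               for s in S_red)

# AEPS-style clock on times 0..3 (3 loops); the tEPS-style reduction
# skips idle times and jumps straight to the next interesting time.
step = lambda n: {min(n + 1, 3)}
skip = {0: {2}, 1: {3}, 2: {3}, 3: {3}}
assert left_total_subset_of_plus({0, 1, 2, 3}, lambda n: skip[n], step)
```

This is exactly the shape of the AEPS/tEPS argument used below: every time-skipping step of the reduced system is a multi-step of the original.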

We now illustrate the usefulness of the theory of skipping refinement using our running example of event processing systems. Consider MPEPS, which uses a priority queue to find a non-empty set of events (say E<sub>t</sub>) scheduled to execute at the current time and executes them. We allow the priority queue in MPEPS to be deterministic or nondeterministic. For example, the priority queue may deterministically select a single event in E<sub>t</sub> to execute, or, based on considerations such as resource utilization, it may execute some subset of events in E<sub>t</sub> in a single step. When reasoning about the correctness of MPEPS, one thing to notice is that the two systems use different data structures: MPEPS uses a priority queue to efficiently find the next set of events to execute in the scheduler, while AEPS uses a simple abstract set representation for the scheduler. Another thing to notice is that MPEPS can "execute faster" than AEPS in two ways: it can increment time by more than 1 and it can execute more than one event in a single step. The theory of skipping refinement developed in this paper enables us to separate out these concerns and apply a stepwise refinement approach to effectively analyze MPEPS.

First, we account for the difference in the data structures between MPEPS and AEPS. Towards this, we define an intermediate system MEPS that is identical to MPEPS except that the scheduler in MEPS is now represented as a set of event-time pairs. Under a refinement map, say p, that extracts the set of event-time pairs in the priority queue of MPEPS, a step in MPEPS can be matched by a step in MEPS. Hence, MPEPS ≲<sub>p</sub> MEPS. Next we account for the difference between MEPS and AEPS in the number of events the two systems may execute in a single step. Towards this, observe that the state spaces of MEPS and tEPS are equal and the transition relation of MEPS is a left-total subset of the transitive closure of the transition relation of tEPS. Hence, from Theorem 5, we infer that MEPS is a skipping refinement of tEPS using the identity function, say I<sub>1</sub>, as the refinement map, *i.e.*, MEPS ≲<sub>I1</sub> tEPS. Next observe that the state spaces of tEPS and AEPS are equal and the transition relation of tEPS is a left-total subset of the transitive closure of the transition relation of AEPS. Hence, from Theorem 5, tEPS is a skipping refinement of AEPS using the identity function, say I<sub>2</sub>, as the refinement map, *i.e.*, tEPS ≲<sub>I2</sub> AEPS. Finally, from the transitivity of skipping refinement (Theorem 4), we conclude that MPEPS ≲<sub>p′</sub> AEPS, where p′ = p; I<sub>1</sub>; I<sub>2</sub>.

#### **4 Mechanised Reasoning**

To prove that a transition system M<sub>C</sub> is a skipping refinement of a transition system M<sub>A</sub> using Definition 3 requires us to show that for any fullpath from M<sub>C</sub> we can find a matching fullpath from M<sub>A</sub>. However, reasoning about the existence of infinite sequences can be problematic using automated tools. In this section, we develop sound and complete local proof methods that are applicable even if a system exhibits unbounded skipping. We first briefly present the proof methods, reduced well-founded skipping and well-founded skipping simulation, developed in [10].

**Definition 5 Reduced Well-founded Skipping** [10]*.* B ⊆ S × S *is a reduced well-founded skipping relation on TS* M = ⟨S, −→, L⟩ *iff:*

*(RWFSK1)* ⟨∀s, w ∈ S : sBw : L.s = L.w⟩
*(RWFSK2) There exists a function* rankt : S × S → W *such that* ⟨W, ≺⟩ *is well founded and*

$$\begin{aligned} \langle \forall s, u, w \in S: s \to u \land sBw: \\ (a) \ (uBw \land rankt(u, w) \prec rankt(s, w)) \lor \\ (b) \ \langle \exists v: w \to^+ v: uBv \rangle \rangle \end{aligned}$$
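For intuition, RWFSK can be checked directly on a small finite transition system. The sketch below (an explicit-state check with names of our choosing, not the paper's ACL2s method; naturals serve as the well-founded order) makes the global nature of RWFSK2b visible in the reachability computation:

```python
def is_rwfsk(B, step, L, rankt):
    """Check Definition 5 on a finite transition system (sketch).
    step(s) -> set of successors; L(s) -> label; rankt(u, w) -> natural."""
    def reach_plus(w):
        # states reachable from w in one or more steps (RWFSK2b is global)
        seen, frontier = set(), set(step(w))
        while frontier:
            seen |= frontier
            frontier = {v for u in frontier for v in step(u)} - seen
        return seen

    if any(L(s) != L(w) for (s, w) in B):                    # RWFSK1
        return False
    for (s, w) in B:                                         # RWFSK2
        for u in step(s):
            a = (u, w) in B and rankt(u, w) < rankt(s, w)
            b = any((u, v) in B for v in reach_plus(w))
            if not (a or b):
                return False
    return True

# Tiny skipping example: the concrete step c0 -> c1 is matched by the
# abstract two-step path a0 -> a1 -> a2.
step = {'a0': {'a1'}, 'a1': {'a2'}, 'a2': {'a2'},
        'c0': {'c1'}, 'c1': {'c1'}}
L = {'a0': 0, 'a1': 1, 'a2': 2, 'c0': 0, 'c1': 2}
B = {('c0', 'a0'), ('c1', 'a2')}
assert is_rwfsk(B, lambda s: step[s], lambda s: L[s], lambda u, w: 0)
```

A constant rankt suffices here because every obligation is discharged by the skipping disjunct (b).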

**Definition 6 Well-founded Skipping** [10]**.** B ⊆ S × S *is a well-founded skipping relation on TS* M = ⟨S, −→, L⟩ *iff:*

*(WFSK1)* ⟨∀s, w ∈ S : sBw : L.s = L.w⟩
*(WFSK2) There exist functions* rankt : S × S → W *and* rankl : S × S × S → ω *such that* ⟨W, ≺⟩ *is well founded and*

$$\begin{aligned} \langle \forall s, u, w \in S: s \to u \land sBw: \\ (a) \ \langle \exists v: w \to v: uBv \rangle \ \lor \\ (b) \ (uBw \land rankt(u, w) \prec rankt(s, w)) \ \lor \\ (c) \ \langle \exists v: w \to v: sBv \land rankl(v, s, u) < rankl(w, s, u) \rangle \ \lor \\ (d) \ \langle \exists v: w \to^{\ge 2} v: uBv \rangle \rangle \end{aligned}$$

**Theorem 6** [10]*. Let* M = ⟨S, −→, L⟩ *be a TS and* B ⊆ S × S*. The following statements are equivalent:*

*(i)* B *is an SKS on* M*.*
*(ii)* B *is a WFSK on* M*.*
*(iii)* B *is an RWFSK on* M*.*

Recall the event processing systems AEPS and tEPS described in Sect. 1.1. When no events are scheduled to execute at a given time, say t, tEPS increments time t to the earliest time in the future, say k > t, at which an event is scheduled for execution. Execution of an event can add an event that is scheduled to be executed at an arbitrary time in the future. Therefore, we cannot determine an upper bound on k a priori. Using either of the above two proof methods to reason about skipping refinement would require unbounded reachability analysis (conditions RWFSK2b and WFSK2d), which is often difficult for automated verification tools. To redress the situation, we develop two new proof methods for SKS; both require only local reasoning about states and their successors.

**Definition 7 Reduced Local Well-founded Skipping.** B ⊆ S × S *is a reduced local well-founded skipping relation on TS* M = ⟨S, −→, L⟩ *iff:*

*(RLWFSK1)* ⟨∀s, w ∈ S : sBw : L.s = L.w⟩
*(RLWFSK2) There exist functions* rankt : S × S → W *and* rankls : S × S → ω *such that* ⟨W, ≺⟩ *is well founded, and a binary relation* O ⊆ S × S

$$\begin{aligned} &such \; that \\ &\quad \langle \forall s, u, w \in S: sBw \land s \to u: \\ &\quad (a) \ (uBw \land rankt(u, w) \prec rankt(s, w)) \ \vee \\ &\quad (b) \ \langle \exists v: w \to v: u\mathcal{O}v \rangle \rangle \\ &\quad \text{and} \\ &\quad \langle \forall x, y \in S: x\mathcal{O}y: \\ &\quad (c) \ xBy \ \vee \\ &\quad (d) \ \langle \exists z: y \to z: x\mathcal{O}z \land rankls(z, x) < rankls(y, x) \rangle \rangle \end{aligned}$$
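In contrast with RWFSK2b, every obligation in RLWFSK2 mentions only single steps. A finite-state sketch (illustrative names, not the paper's mechanization):

```python
def is_rlwfsk(B, O, step, L, rankt, rankls):
    """Check Definition 7 on a finite transition system (sketch).
    Unlike RWFSK2b, every obligation below inspects only single steps:
    there is no reachability computation."""
    if any(L(s) != L(w) for (s, w) in B):          # RLWFSK1
        return False
    for (s, w) in B:                               # first conjunct
        for u in step(s):
            a = (u, w) in B and rankt(u, w) < rankt(s, w)
            b = any((u, v) in O for v in step(w))
            if not (a or b):
                return False
    for (x, y) in O:                               # second conjunct
        c = (x, y) in B
        d = any((x, z) in O and rankls(z, x) < rankls(y, x)
                for z in step(y))
        if not (c or d):
            return False
    return True

# The skipping example again: c0 -> c1 while the abstract system needs
# two steps a0 -> a1 -> a2; O carries u "ahead of" the abstract run, and
# rankls decreases as the abstract run catches up.
step = {'a0': {'a1'}, 'a1': {'a2'}, 'a2': {'a2'},
        'c0': {'c1'}, 'c1': {'c1'}}
L = {'a0': 0, 'a1': 1, 'a2': 2, 'c0': 0, 'c1': 2}
B = {('c0', 'a0'), ('c1', 'a2')}
O = {('c1', 'a1'), ('c1', 'a2')}
rankls = lambda y, x: {'a1': 2, 'a2': 1}.get(y, 0)
assert is_rlwfsk(B, O, lambda s: step[s], lambda s: L[s],
                 lambda u, w: 0, rankls)
```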

Observe that to prove that a relation is an RLWFSK on a transition system, it suffices to reason about single steps of the transition system. Also, note that RLWFSK does not differentiate between skipping and stuttering on the right; this is based on the earlier observation that skipping subsumes stuttering, which we used to simplify the definition. However, it can often be useful to differentiate between skipping and stuttering. Next we define local well-founded skipping simulation (LWFSK), a characterization of skipping simulation that separates reasoning about skipping and stuttering on the right (Fig. 2).

**Fig. 2.** Local well-founded skipping simulation (orange lines indicate that states are related by *B* and blue lines indicate that states are related by *O*) (Color figure online)

**Definition 8 Local Well-founded Skipping.** B ⊆ S × S *is a local well-founded skipping relation on TS* M = ⟨S, −→, L⟩ *iff:*

*(LWFSK1)* ⟨∀s, w ∈ S : sBw : L.s = L.w⟩
*(LWFSK2) There exist functions* rankt : S × S → W*,* rankl : S × S × S → ω*, and* rankls : S × S → ω *such that* ⟨W, ≺⟩ *is well founded, and a binary relation* O ⊆ S × S *such that*

$$\begin{aligned} &\langle \forall s, u, w \in S: sBw \land s \to u: \\ &\quad (a) \ \langle \exists v: w \to v: uBv \rangle \ \vee \\ &\quad (b) \ (uBw \land rankt(u, w) \prec rankt(s, w)) \ \vee \\ &\quad (c) \ \langle \exists v: w \to v: sBv \land rankl(v, s, u) < rankl(w, s, u) \rangle \ \vee \\ &\quad (d) \ \langle \exists v: w \to v: u\mathcal{O}v \rangle \rangle \\ &\text{and} \\ &\langle \forall x, y \in S: x\mathcal{O}y: \\ &\quad (e) \ xBy \ \vee \\ &\quad (f) \ \langle \exists z: y \to z: x\mathcal{O}z \land rankls(z, x) < rankls(y, x) \rangle \rangle \end{aligned}$$


As with RLWFSK, to prove that a relation is an LWFSK, reasoning about single steps of the transition system suffices. However, LWFSK2b accounts for stuttering on the right, while LWFSK2d, together with LWFSK2e and LWFSK2f, accounts for skipping on the right. Also observe that states related by O are not required to be labeled identically and may have no observable relationship to the states related by B.

**Soundness and Completeness.** We next show that RLWFSK and LWFSK in fact completely characterize skipping simulation, *i.e.*, RLWFSK and LWFSK are sound and complete proof rules. Thus, if a concrete system M<sub>C</sub> is a skipping refinement of M<sub>A</sub>, one can always effectively reason about it using RLWFSK or LWFSK.

**Theorem 7.** *Let* M = ⟨S, −→, L⟩ *be a transition system and* B ⊆ S × S*. The following statements are equivalent:*

*(i)* B *is an SKS on* M*.*
*(ii)* B *is a WFSK on* M*.*
*(iii)* B *is an RWFSK on* M*.*
*(iv)* B *is an RLWFSK on* M*.*
*(v)* B *is an LWFSK on* M*.*

*Proof:* The equivalence of (i), (ii), and (iii) follows from Theorem 6. That (iv) implies (v) follows from the simple observation that RLWFSK2 implies LWFSK2. To complete the proof, we show that (v) implies (ii) (Lemma 4) and that (iii) implies (iv) (Lemma 5).

**Lemma 4.** *If* B *is an LWFSK on* M*, then* B *is a WFSK on* M*.*

*Proof.* Let B be an LWFSK on M. WFSK1 follows directly from LWFSK1. Let *rankt*, *rankl*, and *rankls* be functions, and O a binary relation, such that LWFSK2 holds. To show that WFSK2 holds, we use the same *rankt* and *rankl* functions. Let s, u, w ∈ S with s → u and sBw. LWFSK2a, LWFSK2b, and LWFSK2c are equivalent to WFSK2a, WFSK2b, and WFSK2c, respectively, so we show that if only LWFSK2d holds, then WFSK2d holds. Since LWFSK2d holds, there is a successor v of w such that uOv. Since uOv holds, either LWFSK2e or LWFSK2f must hold between u and v. However, since LWFSK2a does not hold, LWFSK2e cannot hold, and LWFSK2f must hold, *i.e.*, there exists a successor v′ of v such that uOv′ ∧ *rankls*(v′, u) < *rankls*(v, u). So, we need a path of at least two steps from w to satisfy the universally quantified constraint on O. Consider an arbitrary path δ such that δ.0 = w, δ.1 = v, δ.2 = v′, uOδ.i, LWFSK2e does not hold between u and δ.i for i ≥ 1, and *rankls*(δ.(i + 1), u) < *rankls*(δ.i, u). Any such path must be finite because the order on the range of *rankls* is well founded. Hence, δ is a finite path and there exists a k ≥ 2 such that LWFSK2e holds between u and δ.k. Therefore, WFSK2d holds, *i.e.*, there is a state in δ, reachable from w in two or more steps, that is related to u by B.

**Lemma 5.** *If* B *is an RWFSK on* M*, then* B *is an RLWFSK on* M*.*

*Proof.* Let B be an RWFSK on M. RLWFSK1 follows directly from RWFSK1. To show that RLWFSK2 holds, we use any *rankt* function that can be used to show that RWFSK2 holds. We define O as follows.

$$\mathcal{O} = \{(u, v) : \langle \exists z : v \to^+ z : uBz \rangle\}$$

We define *rankls*(u, v) to be the minimal length of an M-segment that starts at v and ends at a state, say z, such that uBz, if such a segment exists, and 0 otherwise. Let s, u, w ∈ S, sBw and s → u. If RWFSK2a holds between s, u, and w, then RLWFSK2a also holds. Next, suppose that RWFSK2a does not hold but RWFSK2b holds, *i.e.*, there is an M-segment w, a, ..., v such that uBv; therefore, uOa and RLWFSK2b holds.
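On a finite system, this *rankls* can be computed by breadth-first search; a sketch (our names; lengths count states, so the trivial one-state segment has length 1):

```python
from collections import deque

def rankls(step, B, u, v):
    """Minimal length (number of states) of an M-segment from v to some
    z with (u, z) in B; 0 if no such segment exists. Finite-state sketch."""
    dist, queue = {v: 1}, deque([v])
    while queue:
        y = queue.popleft()
        if (u, y) in B:
            return dist[y]
        for z in step(y):
            if z not in dist:
                dist[z] = dist[y] + 1
                queue.append(z)
    return 0

# Abstract chain a0 -> a1 -> a2 (self-loop), with only a2 related to c1:
# the minimal segment a0, a1, a2 has length 3.
step = {'a0': {'a1'}, 'a1': {'a2'}, 'a2': {'a2'}}
B = {('c1', 'a2')}
assert rankls(lambda s: step[s], B, 'c1', 'a0') == 3
```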

To finish the proof, we show that O and *rankls* satisfy the constraints imposed by the second conjunct in RLWFSK2. Let x, y ∈ S with xOy. If xBy, then RLWFSK2c holds, so suppose that ¬(xBy). From the definition of O, there is an M-segment from y to a state related to x by B; let ŷ be such a segment of minimal length. From the definition of *rankls*, we have *rankls*(y, x) = |ŷ|. Observe that y cannot be the last state of ŷ and |ŷ| ≥ 2: the last state in ŷ must be related to x by B, but by assumption ¬(xBy). Let y′ be the successor of y in ŷ. Clearly, xOy′; therefore, *rankls*(y′, x) ≤ |ŷ| − 1 < *rankls*(y, x), since the length of a minimal M-segment from y′ to a state related to x by B is at most |ŷ| − 1. Hence RLWFSK2d holds.

#### **5 Case Study (Event Processing System)**

In this section, we analyze the correctness of an optimized event processing system (PEPS) that uses a *priority queue* to find an event scheduled to execute at any given time. We show that PEPS refines AEPS, a simple event processing system described in Sect. 1. Our goal is to illustrate the benefits of the theory of skipping refinement and the associated local proof methods developed in the paper. We use ACL2s [3], an interactive theorem prover, to define the operational semantics of the systems and mechanize a proof of its correctness.

**Operational Semantics of PEPS:** A state of the PEPS system is a triple ⟨tm, otevs, mem⟩, where tm is a natural number denoting the current time, otevs is a set of timed-event pairs denoting the scheduler, ordered with respect to a total order te-< on timed-event pairs, and mem is a collection of variable-integer pairs denoting the shared memory. The transition function of PEPS is defined as follows: if there are no events in otevs, then PEPS just increments the current time by 1. Otherwise, it picks the first timed-event pair, say ⟨e, t⟩, in otevs, executes it, and updates the time to t. The execution of an event may result in adding new timed-events to the scheduler, removing existing timed-events from the scheduler, and updating the memory. Finally, the executed timed-event is removed from the scheduler. This is a simple, generic model of an event processing system. Notice that the ability to remove events can be used to specify systems with preemption [23]: an event scheduled to execute at some future time may be canceled (and possibly rescheduled to execute at a different time in the future) as a result of the execution of an event that preempts it. Notice that, for a given total order, PEPS is a deterministic system.
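The transition function just described can be sketched in a few lines of Python. The event bodies passed in below are placeholders for the constrained functions introduced next; the sorted list of (time, event) pairs stands in for the priority queue, and all names are ours, not the ACL2s model's:

```python
def peps_step(state, step_events_add, step_events_rm, step_memory):
    tm, otevs, mem = state
    if not otevs:
        return (tm + 1, otevs, mem)          # no events: tick by 1
    (t, ev), rest = otevs[0], otevs[1:]      # minimal timed event
    add = step_events_add(ev, t, mem)        # new events, all later than t
    rm = step_events_rm(ev, t, mem)          # events cancelled by ev
    mem2 = step_memory(ev, t, mem)
    otevs2 = sorted((set(rest) | add) - rm)
    return (t, otevs2, mem2)                 # time jumps to t: skipping

# Placeholder bodies: event 'e' bumps a counter and reschedules itself
# once, three time units later.
add = lambda ev, t, mem: {(t + 3, ev)} if mem.get('x', 0) < 1 else set()
rm = lambda ev, t, mem: set()
upd = lambda ev, t, mem: {**mem, 'x': mem.get('x', 0) + 1}

s = peps_step((0, [(2, 'e')], {}), add, rm, upd)
assert s == (2, [(5, 'e')], {'x': 1})        # time skipped from 0 to 2
```

A single step jumps the clock from 0 to 2, which is exactly the skipping behavior that the refinement proof must account for.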

The execution of an event is modeled using three constrained functions that take as input an event, ev, a time, t, and a memory, mem: step-events-add returns the set of new timed-event pairs to add to the scheduler; step-events-rm returns the set of timed-event pairs to remove from the scheduler; and step-memory returns a memory updated as specified by the event. We place minimal constraints on these functions. For example, we only require that step-events-add returns a set of timed-event pairs of the form ⟨e, t<sub>e</sub>⟩ where t<sub>e</sub> is greater than the current time t. The constrained functions are defined using the encapsulate construct in ACL2 and can be instantiated with any executable definitions that satisfy these constraints without affecting the proof of correctness of PEPS. Moreover, note that the particular choice of the total order on timed-event pairs is irrelevant to the proof of correctness of PEPS.

**Stepwise Refinement:** We show that PEPS refines AEPS using a stepwise refinement approach: first we define an intermediate system HPEPS obtained by augmenting PEPS with history information and show that PEPS is a simulation refinement of HPEPS. Second, we show that HPEPS is a skipping refinement of AEPS. Finally, we appeal to Theorems 1 and 4 to infer that PEPS refines AEPS. Note that the compositionality of skipping refinement enables us to decompose the proof into a sequence of refinement proofs, each of which is simpler. Moreover, the history information in HPEPS is helpful in defining the witnessing binary relation and the rank function required to prove skipping refinement.

An HPEPS state is a four-tuple ⟨tm, otevs, mem, h⟩, where tm, otevs, and mem are, respectively, the current time, an ordered set of timed events, and a collection of variable-integer pairs, and h is the history information. The history information h consists of a Boolean variable valid, a time tm, an ordered set of timed-event pairs otevs, and a memory mem. Intuitively, h records the state preceding the current state. The transition function of HPEPS is the same as the transition function of PEPS, except that HPEPS also records the history in h.

**PEPS Refines HPEPS:** Observe that, modulo the history information, a step of PEPS directly corresponds to a step of HPEPS, *i.e.*, PEPS is a bisimulation refinement of HPEPS under a refinement map that projects a PEPS state ⟨tm, otevs, mem⟩ to the HPEPS state ⟨tm, otevs, mem, h⟩ where the valid component of h is set to false. But we only prove that it is a simulation refinement, because, by Theorem 1, that suffices to establish that PEPS is a skipping refinement of HPEPS. The proofs primarily require showing that two ordered sets of timed-events that are set-equivalent are in fact equal, and that adding and removing equivalent sets of timed-events from equal schedulers results in equal schedulers.

**HPEPS Refines AEPS:** Next we show that HPEPS is a skipping refinement of AEPS under the refinement map R, a function that simply projects an HPEPS state to an AEPS state. To show this, from Definition 4, we must exhibit as a witness a binary relation B that satisfies the two conditions. Let B = {(s, R.s) : s *is an HPEPS state*}. To establish that B is an SKS on the disjoint union of HPEPS and AEPS, we have a choice of four proof methods (Sect. 4). Recall that execution of an event can add a new event scheduled to be executed at an arbitrary time in the future. As a result, if we were to use WFSK or RWFSK, the proof obligations from conditions WFSK2d (Definition 6) and RWFSK2b (Definition 5) would require unbounded reachability analysis, something that typically places a big burden on verification tools and their users. In contrast, the proof obligations to establish RLWFSK are local and only require reasoning about states and their successors, which significantly reduces the proof complexity.

RLWFSK1 holds trivially. To prove that RLWFSK2 holds, we define a binary relation O and a rank function *rankls* and show that they satisfy the two universally quantified formulas in RLWFSK2. Moreover, since HPEPS does not stutter, RLWFSK2a is vacuous, which is why we do not define *rankt*. Finally, our proof obligation is: for all HPEPS states s, u and any AEPS state w such that s → u and sBw hold, there exists an AEPS state v such that w → v and uOv hold.

**Verification Effort:** We used the defdata framework in ACL2s to specify the data definitions for the three systems, and the definec construct to introduce function definitions along with their input contracts (pre-conditions) and output contracts (post-conditions). In addition to admitting a data definition, defdata proves several theorems about the functions that are extremely helpful in automatically discharging type-like proof obligations. We also developed a library to concisely describe functions using higher-order constructs like map and reduce, which made some of the definitions clearer. ACL2s supports first-order quantifiers via the defun-sk construct, which essentially amounts to the use of Hilbert's choice operator. We use defun-sk to model the transition relation for AEPS (a nondeterministic system) and to specify the proof obligations for proving that HPEPS refines AEPS. However, support for automated reasoning about quantifiers is limited in ACL2. Therefore, we use domain knowledge, when possible (*e.g.*, when a system is deterministic), to eliminate quantifiers in the proof obligations and provide explicit witnesses for existential quantifiers.

The proof makes essential use of several libraries available in ACL2 for reasoning about lists and sets. In addition, we prove a collection of additional lemmas that can be roughly grouped into four categories. First, we have a collection of lemmas to prove the input-output contracts of the functions. Second, we have a collection of lemmas to show that operations on the schedulers in the three systems preserve various invariants, *e.g.*, that any timed-event in the scheduler is scheduled to execute at a time greater than or equal to the current time. Third, we have a collection of lemmas to show that inserting and removing two equivalent sets of timed-events from a scheduler results in an equivalent scheduler. Fourth, we have a collection of lemmas to show that two schedulers are equivalent *iff* they are set equal. The above lemmas are used to establish a relationship between priority queues, a data structure used by the implementation system, and sets, the corresponding data structure used in the specification system. The behavioral difference between the two systems is accounted for by the notion of skipping refinement. This separation significantly eases understanding, as well as mechanical reasoning about the correctness of reactive systems. We have 8 top-level proof obligations and a few dozen supporting lemmas. The entire proof takes about 120 s on a machine with a 2.2 GHz Intel Core i7 and 16 GB of main memory.

#### **6 Related Work**

Several notions of correctness have been proposed in the literature and their properties have been widely studied [2,5,11,16,17]. In this paper, we develop a theory of skipping refinement to effectively prove the correctness of optimized reactive systems using automated verification tools. These results place skipping refinement on par with notions of refinement based on (bi)simulation [22] and stuttering (bi)simulation [20,24], in the sense that skipping refinement (1) is compositional and (2) admits local proof methods. Together, the two properties have been instrumental in significantly reducing the proof complexity in the verification of large and complex systems. We developed the theory of skipping refinement using a generic model of transition systems and place no restrictions on the state space size or the branching factor of the transition system. Any system with a well-defined operational semantics can be mapped to a labeled transition system. Moreover, the local proof methods are sound and complete, *i.e.*, if an implementation is a skipping refinement of the specification, we can always use the local proof methods to effectively reason about it.

Refinement-based methodologies have been successfully used to verify the correctness of several realistic hardware and software systems. In [13], several complex concurrent programs were verified using a stepwise refinement methodology. In addition, Kragl and Qadeer [13] also develop a compact representation to facilitate the description of programs at different levels of abstraction and the associated refinement proofs. Several back-end compiler transformations are proved correct in CompCert [15] using simulation refinement. In [25], several compiler transformations were verified using stuttering refinement and the associated local proof methods. Recently, refinement-based methodology has also been applied to verify the correctness of practical distributed systems [8] and a general-purpose operating system microkernel [12]. The full verification of CertiKOS [6,7], an OS kernel, is based on the notion of simulation refinement. Refinement-based approaches have also been extensively used to verify microprocessor designs [1,9,19,21,26]. Skipping refinement was used to verify the correctness of optimized memory controllers and a JVM-inspired stack machine [10].

#### **7 Conclusion and Future Work**

In this paper, we developed the theory of skipping refinement. Skipping refinement is designed to reason about the correctness of optimized reactive systems, a class of systems where a single transition in a concrete low-level implementation may correspond to a sequence of observable steps in the corresponding abstract high-level specification. Examples of such systems include optimizing compilers, concurrent and parallel systems, and superscalar processors. We developed sound and complete proof methods that reduce global reasoning about infinite computations of such systems to local reasoning about states and their successors. We also showed that skipping simulation is closed under relational composition and is therefore amenable to modular reasoning using a stepwise refinement approach. We experimentally validated our results by analyzing the correctness of an optimized event-processing system in ACL2s. For future work, we plan to precisely classify the temporal logic properties that are preserved by skipping refinement. This would enable us to transfer temporal properties from specifications to implementations, after establishing refinement.

#### **References**



### **Robust Controller Synthesis in Timed Büchi Automata: A Symbolic Approach**

Damien Busatto-Gaston<sup>1(B)</sup>, Benjamin Monmege<sup>1</sup>, Pierre-Alain Reynier<sup>1</sup>, and Ocan Sankur<sup>2</sup>

<sup>1</sup> Aix Marseille Univ, Université de Toulon, CNRS, LIS, Marseille, France {damien.busatto,pierre-alain.reynier}@lis-lab.fr, benjamin.monmege@univ-amu.fr <sup>2</sup> Univ Rennes, Inria, CNRS, IRISA, Rennes, France ocan.sankur@irisa.fr

**Abstract.** We solve, in a purely symbolic way, the robust controller synthesis problem in timed automata with Büchi acceptance conditions. The goal of the controller is to play according to an accepting lasso of the automaton while resisting timing perturbations chosen by a competing environment. The problem was previously shown to be PSPACE-complete using region-based techniques, but we provide the first tool solving the problem using zones only, and thus more resilient to the state-space explosion problem. The key ingredient is the introduction of branching constraint graphs, which allow one to decide in polynomial time whether a given lasso is robust, and even to compute the largest admissible perturbation if it is. We also make an original use of constraint graphs in this context to test the inclusion of timed reachability relations, which is crucial for the termination criterion of our algorithm. Our techniques are illustrated on a case study on the regulation of a train network.

#### **1 Introduction**

Timed automata [1] extend finite-state automata with timing constraints, providing an automata-theoretic framework to design, model, verify and synthesise real-time systems. However, the semantics of timed automata is a mathematical idealisation: it assumes that clocks have infinite precision and that actions are instantaneous. Proving that a timed automaton satisfies a property does not ensure that a real implementation of it also does. This *robustness* issue is a challenging problem for embedded systems [12], and alternative semantics have been proposed to ensure that the verified (or synthesised) behaviour remains correct in the presence of small timing perturbations.

We are interested in a fundamental controller synthesis problem in timed automata equipped with a Büchi acceptance condition: it consists in determining whether there exists an accepting infinite execution.

This work was funded by ANR project Ticktac (ANR-18-CE40-0015).

Thus, the role of the controller is to choose transitions and delays. This problem has been studied extensively in the exact setting [13–15,17,19,27,28]. In the context of robustness, the controller's strategy should tolerate small perturbations of the delays. This discards strategies suffering from weaknesses such as Zeno behaviours, or even non-Zeno behaviours requiring infinite precision, as exhibited in [6].

More formally, the semantics we consider is defined as a game that depends on a parameter δ representing an upper bound on the amplitude of the perturbation [7]. In this game, the controller plays against an antagonistic environment that can perturb each delay by a value chosen in the interval [−δ, δ]. The case of a fixed value of δ has been shown to be decidable in [7], and for a related model in [18]. However, these algorithms are based on regions and, as the value of δ may be very different from the constants appearing in the guards of the automaton, they do not yield practical algorithms. Moreover, the maximal perturbation is not necessarily known in advance, and could be considered part of the design process.

The problem we are interested in is *qualitative*: we want to determine whether *there exists* a positive value of δ such that the controller wins the game. It has been proven in [25] that this problem is in PSPACE (and is in fact PSPACE-complete), thus no harder than the exact setting with no perturbation allowed [1]. However, the algorithm heavily relies on regions, and more precisely on an abstraction that refines regions, namely folded orbit graphs. Hence, it is not at all amenable to implementation.

Our objective is to provide an efficient symbolic algorithm for solving this problem. To this end, we target the use of *zones* instead of regions, as they allow an on-demand partitioning of the state space. Moreover, the algorithm we develop explores the reachable state space in a *forward* manner. This is known to lead to better performance, as witnessed by the successful tool UPPAAL TIGA, based on forward algorithms for solving controller synthesis problems [5].

Our algorithm can be understood as an adaptation to the robustness setting of the standard algorithm for Büchi acceptance in timed automata [17]. This algorithm looks for an accepting lasso using a double depth-first search. A major difficulty consists in checking whether a lasso can be robustly iterated, i.e. whether there exists δ > 0 such that the controller can follow the cycle for infinitely many steps while tolerating perturbations of amplitude at most δ. The key argument of [25] was the notion of the aperiodic folded orbit graph of a path in the region automaton, thus tightly connected to regions. Lifting this notion to zones seems impossible, as it makes crucial use of the fact that valuations in a region are time-abstract bisimilar, which is not the case for zones.

Our contributions are threefold. First, we provide a polynomial-time procedure to decide, given a lasso, whether it can be robustly iterated. This symbolic algorithm relies on the computation of the greatest fixpoint of the operator describing the set of controllable predecessors of a path. In order to provide an argument of termination for this computation, we resort to a new notion of branching constraint graphs, extending the approach used in [16,26], based on constraint graphs (introduced in [8]), to check iterability of a cycle without robustness requirements. Second, we show that when considering a lasso, not only can we decide robust iterability, but we can even compute the largest perturbation under which it is controllable. This problem was not known to be decidable before. Finally, we provide a termination criterion for the analysis of lassos. Focusing on zones is not complete: it can be the case that two cycles lead to the same zones, but one is robustly iterable while the other one is not. Robust iterability crucially depends on the real-time dynamics of the cycle, and we prove that it actually only depends on the reachability relation of the path. We provide a polynomial-time algorithm for checking inclusion between reachability relations of paths in timed automata, based on constraint graphs. It is worth noticing that all our procedures can be implemented using difference bound matrices, a very efficient data structure used for timed systems. These developments have been integrated in a tool, and we present a case study of a train regulation network illustrating its performance.

**Fig. 1.** A timed automaton

Integrating the robustness question in the verification of real-time systems has attracted attention in the community; recent works include, for instance, robust model checking for timed automata under clock drifts [23], Lipschitz robustness notions for timed systems [11], and quantitative robust synthesis for timed automata [2]. Stability analysis and the synthesis of stabilizing controllers in hybrid systems are a closely related topic, see e.g. [20,21].

#### **2 Timed Automata: Reachability and Robustness**

Let X = {x<sub>1</sub>,...,x<sub>n</sub>} be a finite set of clock variables. It is extended with a virtual clock x<sub>0</sub>, constantly equal to 0, and we denote by X<sub>0</sub> the set X ∪ {x<sub>0</sub>}. An atomic clock constraint on X is a formula x − y ≤ k or x − y < k, with x ≠ y ∈ X<sub>0</sub> and k ∈ ℚ. A constraint is non-diagonal if one of the two clocks is x<sub>0</sub>. We denote by Guards(X) (respectively, Guards<sub>nd</sub>(X)) the set of (clock) constraints (respectively, non-diagonal clock constraints) built as conjunctions of atomic clock constraints (respectively, non-diagonal atomic clock constraints).

A clock valuation ν is an element of ℝ<sub>≥0</sub><sup>X</sup>. It is extended to ℝ<sub>≥0</sub><sup>X<sub>0</sub></sup> by letting ν(x<sub>0</sub>) = 0. For all d ∈ ℝ<sub>≥0</sub>, we let ν + d be the valuation defined by (ν + d)(x) = ν(x) + d for all clocks x ∈ X. If Y ⊆ X, we also let ν[Y ← 0] be the valuation resetting the clocks in Y to 0, without modifying the values of the other clocks. A valuation ν satisfies an atomic clock constraint x − y ≺ k (with ≺ ∈ {≤, <}) if ν(x) − ν(y) ≺ k. The satisfaction relation is then extended to clock constraints naturally: the satisfaction of a constraint g by a valuation ν is denoted by ν |= g. The set of valuations satisfying a constraint g is denoted by ⟦g⟧.

A *timed automaton* is a tuple A = (L, ℓ<sub>0</sub>, E, L<sub>t</sub>) with L a finite set of locations, ℓ<sub>0</sub> ∈ L an initial location, E ⊆ L × Guards<sub>nd</sub>(X) × 2<sup>X</sup> × L a finite set of edges, and L<sub>t</sub> ⊆ L a set of accepting locations.

An example of a timed automaton is depicted in Fig. 1, where the reset of a clock x is denoted by x := 0. The semantics of the timed automaton A is defined as an infinite transition system ⟦A⟧ = (S, s<sub>0</sub>, →). The set S of states of ⟦A⟧ is L × ℝ<sub>≥0</sub><sup>X</sup>, with s<sub>0</sub> = (ℓ<sub>0</sub>, **0**). A transition of ⟦A⟧ is of the form (ℓ, ν) →<sup>e,d</sup> (ℓ′, ν′) with e = (ℓ, g, Y, ℓ′) ∈ E and d ∈ ℝ<sub>≥0</sub> such that ν + d |= g and ν′ = (ν + d)[Y ← 0]. We call a finite sequence of edges of the timed automaton a *path*. The *reachability relation* of a path ρ, denoted by Reach(ρ), is the set of pairs (ν, ν′) such that there is a sequence of transitions of ⟦A⟧ starting from (ℓ, ν), ending in (ℓ′, ν′), and following the edges of ρ in order. A *run* of A is an infinite sequence of transitions of ⟦A⟧ starting from s<sub>0</sub>. We are interested in Büchi objectives: a run is accepting if there exists an accepting location ℓ<sub>t</sub> ∈ L<sub>t</sub> that the run visits infinitely often.
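The one-step semantics above can be made concrete with a small executable sketch. Here the guard x<sub>1</sub> ≤ 2 with reset {x<sub>1</sub>} on the edge from ℓ<sub>1</sub> to ℓ<sub>2</sub> is assumed from the discussion of Fig. 3 in Sect. 3; clock names and the guard encoding are illustrative, not the paper's notation.

```python
# A minimal sketch of one transition of [[A]]: valuations are dicts over
# clock names; (l, v) --e,d--> (l', v') delays by d, checks the guard,
# then resets the clocks in Y.

def delay(v, d):
    """v + d: advance every clock by d time units."""
    return {x: t + d for x, t in v.items()}

def reset(v, Y):
    """v[Y <- 0]: reset the clocks in Y, keep the others."""
    return {x: (0.0 if x in Y else t) for x, t in v.items()}

def sat(v, guard):
    """Guard as a list of atomic constraints (x, op, k), op in {'<', '<='},
    read as v(x) < k or v(x) <= k (non-diagonal constraints only here)."""
    return all(v[x] < k if op == '<' else v[x] <= k for x, op, k in guard)

def step(loc, v, edge, d):
    """Fire edge e = (l, g, Y, l') after delaying by d, if the guard holds."""
    l, g, Y, l2 = edge
    assert loc == l and d >= 0.0
    w = delay(v, d)
    if not sat(w, g):
        return None          # transition not enabled for this delay
    return l2, reset(w, Y)

# One step on the (assumed) edge from l1 to l2: guard x1 <= 2, reset {x1}.
e = ('l1', [('x1', '<=', 2)], {'x1'}, 'l2')
print(step('l1', {'x1': 0.0, 'x2': 0.5}, e, 1.5))  # ('l2', {'x1': 0.0, 'x2': 2.0})
```

With delay 3.0 the same call returns `None`, since the guard x<sub>1</sub> ≤ 2 fails after the delay.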

As is classically done, we assume that every clock is bounded in A by a constant M, that is, we only consider the previous infinite transition system restricted to the subset L × [0, M]<sup>X</sup> of states.

We study the robustness problem introduced in [25], stated in terms of games where a controller fights against an environment. Along a run, the controller chooses delays and transitions to fire, whereas the environment perturbs the chosen delays by a small parameter δ > 0. The aim of the controller is to find a strategy so that, no matter how the environment plays, the generated infinite run satisfies the Büchi condition. Formally, given a timed automaton A = (L, ℓ<sub>0</sub>, E, L<sub>t</sub>) and δ > 0, the perturbation game is a two-player turn-based game G<sub>δ</sub>(A) between a controller and an environment. Its state space is partitioned into S<sub>C</sub> ⊎ S<sub>E</sub>, where S<sub>C</sub> = L × ℝ<sub>≥0</sub><sup>X</sup> belongs to the controller, and S<sub>E</sub> = L × ℝ<sub>≥0</sub><sup>X</sup> × ℝ<sub>>0</sub> × E to the environment. The initial state is (ℓ<sub>0</sub>, **0**) ∈ S<sub>C</sub>. From each state (ℓ, ν) ∈ S<sub>C</sub>, there is a transition to (ℓ, ν, d, e) ∈ S<sub>E</sub>, with e = (ℓ, g, Y, ℓ′) ∈ E, whenever d > δ and ν + d + ε |= g for all ε ∈ [−δ, δ]. Then, from each state (ℓ, ν, d, (ℓ, g, Y, ℓ′)) ∈ S<sub>E</sub>, there is a transition to (ℓ′, (ν + d + ε)[Y ← 0]) ∈ S<sub>C</sub> for each ε ∈ [−δ, δ]. A play of G<sub>δ</sub>(A) is a finite or infinite path q<sub>0</sub> →<sup>t<sub>1</sub></sup> q<sub>1</sub> →<sup>t<sub>2</sub></sup> q<sub>2</sub> ⋯ where q<sub>0</sub> = (ℓ<sub>0</sub>, **0**) and t<sub>i</sub> is a transition from state q<sub>i−1</sub> to q<sub>i</sub>, for all i > 0. It is maximal if it is infinite or cannot be extended with any transition.

A strategy for the controller is a function σ<sub>Con</sub> mapping each non-maximal play ending in some (ℓ, ν) ∈ S<sub>C</sub> to a pair (d, e), where d > 0 and e ∈ E are such that there is a transition from (ℓ, ν) to (ℓ, ν, d, e). A strategy for the environment is a function σ<sub>Env</sub> mapping each finite play ending in (ℓ, ν, d, e) to a state (ℓ′, ν′) related by a transition. A play gives rise to a unique run of ⟦A⟧ by keeping only the states in S<sub>C</sub>. For a pair of strategies (σ<sub>Con</sub>, σ<sub>Env</sub>), we let play<sup>δ</sup><sub>A</sub>(σ<sub>Con</sub>, σ<sub>Env</sub>) denote the run associated with the unique maximal play of G<sub>δ</sub>(A) that follows both strategies. The controller's strategy σ<sub>Con</sub> is winning (with respect to the Büchi objective L<sub>t</sub>) if for all strategies σ<sub>Env</sub> of the environment, play<sup>δ</sup><sub>A</sub>(σ<sub>Con</sub>, σ<sub>Env</sub>) is infinite and visits some location of L<sub>t</sub> infinitely often. The *parametrised robust controller synthesis problem* asks, given a timed automaton A, whether there exists δ > 0 such that the controller has a winning strategy in G<sub>δ</sub>(A).

*Example 1.* The controller has a winning strategy in G<sub>δ</sub>(A), with A the automaton of Fig. 1, for all values of δ < 1/2. Indeed, he can follow the cycle ℓ<sub>0</sub> → ℓ<sub>3</sub> → ℓ<sub>0</sub> by always picking time delay 1/2, so that, when arriving in ℓ<sub>3</sub> (resp. ℓ<sub>0</sub>) after the perturbation of the environment, clock x<sub>2</sub> (resp. x<sub>1</sub>) has a valuation in [1/2 − δ, 1/2 + δ]. Therefore, he can play forever following this memoryless strategy. For δ ≥ 1/2, the environment can enforce reaching ℓ<sub>3</sub> with a value of x<sub>2</sub> at least equal to 1. The guard x<sub>2</sub> < 2 of the next transition to ℓ<sub>0</sub> then cannot be guaranteed, and therefore the controller cannot win G<sub>δ</sub>(A). In [25], it is shown that the cycle around ℓ<sub>2</sub> does not provide a winning strategy for the controller for any value of δ > 0, since perturbations accumulate, so that the controller can only play it a finite number of times in the worst case.

By [25], the parametrised robust controller synthesis problem is known to be PSPACE-complete. Their solution is based on the region automaton of A. We seek a more practical solution using zones. A zone Z over X is a convex subset of ℝ<sub>≥0</sub><sup>X</sup> defined as the set of valuations satisfying a clock constraint g, i.e. Z = ⟦g⟧. Zones can be encoded as *difference-bound matrices (DBMs)*, which are |X<sub>0</sub>| × |X<sub>0</sub>| matrices over (ℝ × {<, ≤}) ∪ {(∞, <)}. We adopt the following notation: for a DBM M, we write M = (M, ≺<sup>M</sup>), where M is the matrix made of the first components, with elements in ℝ ∪ {∞}, while ≺<sup>M</sup> is the matrix of the second components, with elements in {<, ≤}. A DBM M naturally represents a zone (which we abusively write M as well), defined as the set of valuations ν such that, for all x, y ∈ X<sub>0</sub>, ν(x) − ν(y) ≺<sup>M</sup><sub>x,y</sub> M<sub>x,y</sub> (where ν(x<sub>0</sub>) = 0). Coefficients of a DBM are thus pairs (≺, c). As usual, these can be compared: (≺, c) is less than (≺′, c′) (denoted by (≺, c) < (≺′, c′)) whenever c < c′, or c = c′, ≺ = < and ≺′ = ≤. Moreover, coefficients can be added: (≺, c) + (≺′, c′) is the pair (≺″, c + c′) with ≺″ = ≤ if ≺ = ≺′ = ≤, and ≺″ = < otherwise.
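The coefficient order and sum defined above, together with the membership test ν(x) − ν(y) ≺<sup>M</sup><sub>x,y</sub> M<sub>x,y</sub>, can be sketched directly. The example zone below is purely illustrative.

```python
from math import inf

# A DBM coefficient is a pair (c, strict): (c, False) encodes "<= c",
# (c, True) encodes "< c"; (inf, True) stands for the absent bound (<, ∞).
LE, LT = False, True

def coef_lt(a, b):
    """(≺, c) < (≺', c'): either c < c', or c = c' with ≺ = '<', ≺' = '<='."""
    (c, s), (c2, s2) = a, b
    return c < c2 or (c == c2 and s == LT and s2 == LE)

def coef_add(a, b):
    """(≺, c) + (≺', c') = (≺'', c + c'); ≺'' is '<=' only if both are '<='."""
    (c, s), (c2, s2) = a, b
    return (c + c2, s or s2)

def satisfies(dbm, v):
    """Membership: v[x] - v[y] must respect dbm[x][y] for all x, y,
    with v[0] the value of the virtual clock x0 (always 0)."""
    n = len(dbm)
    for x in range(n):
        for y in range(n):
            c, s = dbm[x][y]
            d = v[x] - v[y]
            if (d >= c if s == LT else d > c):
                return False
    return True

# Zone 0 <= x1 <= 2 and x2 - x1 < 1, over (x0, x1, x2):
Z = [[(0, LE),   (0, LE),  (inf, LT)],   # x0 - x1 <= 0, i.e. x1 >= 0
     [(2, LE),   (0, LE),  (inf, LT)],   # x1 - x0 <= 2
     [(inf, LT), (1, LT),  (0, LE)]]     # x2 - x1 < 1

print(satisfies(Z, (0, 1.5, 2.0)))   # True
print(satisfies(Z, (0, 2.5, 2.5)))   # False: x1 > 2
```

Note how strictness propagates through addition: `coef_add((2, LE), (1, LT))` yields `(3, LT)`, i.e. a strict bound, matching the definition in the text.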

DBMs were introduced in [4,10] for analyzing timed automata; we refer to [3] for details. Standard operations used to explore the state space of timed automata have been defined on DBMs: intersection is written M ∩ N; Pre<sub>time</sub><sup>>t</sup>(M) is the set of valuations from which a time delay of more than t time units can lead to the zone M; Unreset<sub>R</sub>(M) is the set of valuations that end in M when the clocks in R are reset. From a robustness perspective, we also consider the operator shrink<sub>[−δ,δ]</sub>(M), introduced in [24], defined as the set of valuations ν such that ν + [−δ, δ] ⊆ M. Given a DBM M and a rational number δ, all these operations can be effectively computed in time cubic in |X|.

#### **3 Reachability Relation of a Path**

Before treating robustness issues, we start by designing a symbolic (i.e. zone-based) approach to describe and compare the reachability relations of paths in timed automata. This will be crucial subsequently to design a termination criterion in the state-space exploration of our robustness-checking algorithm. Solving the inclusion of reachability relations in a symbolic manner has independent interest and can have other applications.

The reachability relation Reach(ρ) of a path ρ is a subset of ℝ<sub>≥0</sub><sup>X ∪ X′</sup>, where X′ contains primed versions of the clocks, such that (ν, ν′) ∈ Reach(ρ) iff there is a run from valuation ν to valuation ν′ following ρ. Unfortunately, reachability relations Reach(ρ) are not zones in general, that is, they cannot be represented using only difference constraints. In fact, we shall see shortly that constraints of the form x − y + z − u ≺ c also appear, as already observed in [22]. We thus cannot rely directly on the traditional difference bound matrices (DBMs) used to represent zones. We instead rely on the constraint graphs that were introduced in [8], and explored in [16] for the parametric case (the latter work considers enlarged constraints, and not shrunk ones as we study here). Our contribution is to use these graphs to obtain a syntactic check of inclusion of the corresponding reachability relations.

**Constraint Graphs.** Rather than considering the values of the clocks in X, this data structure considers the date X<sub>i</sub> of the latest reset of the clock x<sub>i</sub>, and uses a new variable τ denoting the global timestamp. Note that the clock values can be recovered easily since X<sub>i</sub> = τ − x<sub>i</sub>. For the extra clock x<sub>0</sub>, we introduce a variable X<sub>0</sub> equal to the global timestamp τ (since x<sub>0</sub> must remain equal to 0). A constraint graph defining a zone is a weighted graph whose nodes are X = {X<sub>0</sub>, X<sub>1</sub>,...,X<sub>n</sub>}. Constraints on clocks are represented by weights on edges of the graph: a constraint X − Y ≺ c is represented by an edge from X to Y weighted by (≺, c), with ≺ ∈ {≤, <} and c ∈ ℚ. Weights in the graph are thus pairs of the form (≺, c), and we can compute shortest weights between two vertices of the graph. A cycle is said to be negative if it has weight at most (<, 0), i.e. (<, 0) or (≺, c) with c < 0.
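Shortest weights over this algebra can be computed with any all-pairs shortest-path algorithm; the paper does not fix one, so the sketch below uses Floyd–Warshall (our choice, not the paper's). A negative entry on the diagonal, i.e. a weight at most (<, 0), signals a negative cycle.

```python
from math import inf

# Weights are pairs (c, strict): strict=True encodes '<'; (inf, True) plays
# the role of the absent edge (<, ∞).
def add(a, b):
    return (a[0] + b[0], a[1] or b[1])

def smaller(a, b):
    return a[0] < b[0] or (a[0] == b[0] and a[1] and not b[1])

def shortest_weights(n, edges):
    """edges: dict (u, v) -> (c, strict). Returns the n x n matrix of
    shortest weights, or None if the graph has a negative cycle."""
    W = [[(0, False) if i == j else edges.get((i, j), (inf, True))
          for j in range(n)] for i in range(n)]
    for k in range(n):                        # Floyd-Warshall
        for i in range(n):
            for j in range(n):
                cand = add(W[i][k], W[k][j])
                if smaller(cand, W[i][j]):
                    W[i][j] = cand
    for i in range(n):
        if smaller(W[i][i], (0, False)):      # weight at most (<, 0)
            return None                       # negative cycle
    return W

# The cycle 0 -> 1 -> 0 below has weight (<=, 2) + (<, -2) = (<, 0): negative.
print(shortest_weights(2, {(0, 1): (2, False), (1, 0): (-2, True)}))  # None
```

Replacing the strict bound (<, −2) by (≤, −2) makes the cycle weight (≤, 0), which is no longer negative, and the matrix of shortest weights is returned.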

**Encoding Paths.** Constraint graphs can also encode tuples of valuations seen along a path. To encode a k-step computation, we make k + 1 copies of the nodes, that is, X<sup>i</sup> = {X<sup>i</sup><sub>0</sub>, X<sup>i</sup><sub>1</sub>,...,X<sup>i</sup><sub>n</sub>} for i ∈ {1,...,k + 1}. These copies are also called *layers*. Let us first consider an example on the path ρ consisting of the edge from ℓ<sub>1</sub> to ℓ<sub>2</sub>, and the edge from ℓ<sub>2</sub> to ℓ<sub>1</sub>, in the timed automaton of Fig. 1. The constraint graph G<sub>ρ</sub> is depicted in Fig. 3: in our diagrams of constraint graphs, the absence of a label on an edge means (≤, 0), and we depict with an edge with arrows on both ends the presence of an edge in both directions. The graph has five columns, each containing the copies of the variables for one step: they represent the valuations before the first edge, after the first time elapse, after the first reset, after the second time elapse and after the second reset. In general, each elementary operation can be described by a constraint graph with two layers, (X<sub>i</sub>) (before) and (X′<sub>i</sub>) (after).

– The operation Pre<sub>time</sub><sup>>t</sup> is described by the constraint graph G<sup>>t</sup><sub>time</sub> with edges X<sub>i</sub> → X<sub>0</sub> and X<sub>i</sub> ↔ X′<sub>i</sub> for i > 0, and an edge X<sub>0</sub> → X′<sub>0</sub> weighted by (<, −t). Figure 3 contains two occurrences of G<sup>>0</sup><sub>time</sub>: we always represent with dashed arrows edges that are labelled by (<, c), and with plain arrows edges that are labelled by (≤, c); the absence of an edge means that it is labelled by (<, ∞).

– The operation ⟦g⟧ ∩ Unreset<sub>Y</sub>(·), testing a guard g and resetting the clocks in Y, is described by the constraint graph G<sup>g,Y</sup><sub>edge</sub> with edges X<sub>0</sub> ↔ X′<sub>0</sub> (meaning that time does not elapse), X<sub>i</sub> ↔ X′<sub>i</sub> for i such that clock x<sub>i</sub> ∉ Y, X′<sub>i</sub> ↔ X′<sub>0</sub> for i such that clock x<sub>i</sub> ∈ Y, and, for each clock constraint x<sub>i</sub> − x<sub>j</sub> ≺ c appearing in g, an edge from X<sub>j</sub> to X<sub>i</sub> labelled by (≺, c) (since it encodes the fact that (τ − X<sub>i</sub>) − (τ − X<sub>j</sub>) = X<sub>j</sub> − X<sub>i</sub> ≺ c). In Fig. 3, we have first G<sup>x<sub>1</sub>≤2,{x<sub>1</sub>}</sup><sub>edge</sub>, and then G<sup>x<sub>2</sub>≤2,{x<sub>2</sub>}</sup><sub>edge</sub>.

Constraint graphs can be stacked one after the other to obtain the constraint graph of an edge e, and then of a path ρ, which we denote by G<sub>ρ</sub>. In the resulting graph, there is one leftmost layer of vertices (X<sup>ℓ</sup><sub>i</sub>)<sub>i</sub> and one rightmost one (X<sup>r</sup><sub>i</sub>)<sub>i</sub>, representing the situation before and after the firing of the path ρ. Once this graph is constructed, the intermediary layers can be eliminated after replacing each edge between the nodes of X<sup>ℓ</sup> ∪ X<sup>r</sup> by the shortest path in the graph. This phase is hereafter called *normalisation* of the constraint graph. The normalised version of the constraint graph of Fig. 3 is depicted on its right.
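Normalisation, i.e. eliminating the intermediary layers, can be sketched as a shortest-weight computation followed by a projection onto the outer layers. The three-vertex chain below is a toy stand-in for stacked elementary graphs; the vertex names and weights are illustrative.

```python
from math import inf

# Weights are (c, strict) pairs: strict=True encodes '<'.
def add(a, b):
    return (a[0] + b[0], a[1] or b[1])

def smaller(a, b):
    return a[0] < b[0] or (a[0] == b[0] and a[1] and not b[1])

def normalise(vertices, edges, outer):
    """Replace each pair of outer vertices by the shortest weight through
    the whole graph, then keep only the constraints among the outer layers."""
    W = {(u, v): (0, False) if u == v else edges.get((u, v), (inf, True))
         for u in vertices for v in vertices}
    for k in vertices:                        # Floyd-Warshall
        for i in vertices:
            for j in vertices:
                cand = add(W[i, k], W[k, j])
                if smaller(cand, W[i, j]):
                    W[i, j] = cand
    return {(u, v): W[u, v] for u in outer for v in outer if u != v}

# Stacking Xl -(<=,2)-> Xm -(<,-1)-> Xr and eliminating the middle layer
# yields the direct constraint Xl - Xr < 1 between the outer layers.
out = normalise(['Xl', 'Xm', 'Xr'],
                {('Xl', 'Xm'): (2, False), ('Xm', 'Xr'): (-1, True)},
                ['Xl', 'Xr'])
print(out[('Xl', 'Xr')])   # (1, True), i.e. Xl - Xr < 1
```

Note how the strictness of the second edge makes the collapsed constraint strict, as prescribed by the addition rule on coefficients.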

**From Constraint Graphs to Reachability Relations.** From a logical point of view, the elimination of intermediary layers reflects an elimination of quantifiers in a formula of the first-order theory of real numbers. At the end, we obtain a set of constraints of the form X<sup>k</sup><sub>i</sub> − X<sup>k′</sup><sub>j</sub> ≺ c with k, k′ ∈ {ℓ, r}. These constraints do not uniquely reflect the reachability relation Reach(ρ), in the sense that it is possible that Reach(ρ<sub>1</sub>) = Reach(ρ<sub>2</sub>) while the normalised versions of G<sub>ρ<sub>1</sub></sub> and G<sub>ρ<sub>2</sub></sub> are different. For example, if we consider the path ρ<sup>2</sup> obtained by repeating the cycle ρ between ℓ<sub>1</sub> and ℓ<sub>2</sub>, the reachability relation does not change (Reach(ρ<sup>2</sup>) = Reach(ρ)), but the normalised constraint graph does (G<sub>ρ<sup>2</sup></sub> ≠ G<sub>ρ</sub>): all labels (≤, 2) of the red dotted edges from the rightmost layer to the leftmost layer become (≤, 4), and the labels (≤, −2) of the dashed blue edges become (≤, −4).

We solve this issue by jumping back from the variables X<sup>k</sup><sub>i</sub> to clock valuations. Indeed, in terms of the clock valuations ν<sup>ℓ</sup> and ν<sup>r</sup> before and after the path, the constraint X<sup>k</sup><sub>i</sub> − X<sup>k′</sup><sub>j</sub> ≺ c (for k, k′ ∈ {ℓ, r}) rewrites as (τ<sup>k</sup> − ν<sup>k</sup>(x<sub>i</sub>)) − (τ<sup>k′</sup> − ν<sup>k′</sup>(x<sub>j</sub>)) ≺ c, where τ<sup>ℓ</sup> is the global timestamp before firing ρ and τ<sup>r</sup> the one after. When k = k′, the timestamp variables disappear, leaving a constraint of the form ν<sup>k</sup>(x<sub>j</sub>) − ν<sup>k</sup>(x<sub>i</sub>) ≺ c. When k ≠ k′, we can rewrite the constraint as τ<sup>k</sup> − τ<sup>k′</sup> ≺ ν<sup>k</sup>(x<sub>i</sub>) − ν<sup>k′</sup>(x<sub>j</sub>) + c. We therefore obtain upper and lower bounds on the value of τ<sup>r</sup> − τ<sup>ℓ</sup>, allowing us to eliminate τ<sup>r</sup> − τ<sup>ℓ</sup>, considered as a single variable. We therefore obtain in fine a formula mixing constraints of the form


ν<sup>r</sup>(x<sub>c</sub>) − ν<sup>r</sup>(x<sub>d</sub>) ≺<sub>2</sub> p<sub>2</sub>. Thus, γ<sub>a,b,c,d</sub> is obtained as the minimum of the two constraints obtained in this manner. In other terms, in the constraint graph, this constraint is the minimal weight between the sum of the weights of the edges (X<sup>r</sup><sub>d</sub>, X<sup>ℓ</sup><sub>a</sub>) and (X<sup>ℓ</sup><sub>b</sub>, X<sup>r</sup><sub>c</sub>), and the sum of the weights of the edges (X<sup>ℓ</sup><sub>b</sub>, X<sup>ℓ</sup><sub>a</sub>) and (X<sup>r</sup><sub>d</sub>, X<sup>r</sup><sub>c</sub>). For example, for the path in Fig. 3, we have γ<sub>0,1,0,2</sub> = (≤, 0) since the two constraints are (≤, 0) and (<, ∞), whereas γ<sub>1,2,2,1</sub> = (≤, 0) because the two constraints are (<, 2) and (≤, 0).

Let ϕ(G) be the conjunction of the constraints obtained in this way from a constraint graph G once normalised: this is a quantifier-free formula of the additive theory of the reals. We obtain the following property, whose proof mimics the one proving the normalisation of DBMs (and can be derived from the developments of [8]).

**Lemma 1.** *Let* ρ *be a path in a timed automaton. If* G<sub>ρ</sub> *contains a negative cycle, then* Reach(ρ) = ∅*. Otherwise,* Reach(ρ) *is the set of pairs of valuations* (ν<sup>ℓ</sup>, ν<sup>r</sup>) *that satisfy the formula* ϕ(G<sub>ρ</sub>)*.*

**Checking Inclusion.** For a path ρ, we regroup the values (γ<sup>ℓ</sup><sub>a,b</sub>), (γ<sup>r</sup><sub>a,b</sub>) and (γ<sub>a,b,c,d</sub>) above in a single vector Γ<sub>ρ</sub>. We extend the comparison relation ≤ to these vectors by applying it componentwise. These vectors can be used to check equality or inclusion of reachability relations in time O(|X|<sup>4</sup>):

**Theorem 1.** *Let* ρ *and* ρ′ *be paths in a timed automaton such that* Reach(ρ) *and* Reach(ρ′) *are non-empty. Then* Reach(ρ) ⊆ Reach(ρ′) *if and only if* Γ<sub>ρ</sub> ≤ Γ<sub>ρ′</sub>*.*

Notice that we do not need to check equivalence or implication of the formulas ϕ(G<sub>ρ</sub>) and ϕ(G<sub>ρ′</sub>), but simply compare syntactically the constants appearing in these formulas. Moreover, these constants can be stored in usual DBMs over 2 × |X<sub>0</sub>| clocks, allowing the reuse of classical DBM libraries. For the constraint graph in Fig. 3, we have seen that G<sub>ρ<sup>2</sup></sub> ≠ G<sub>ρ</sub>, even though Reach(ρ<sup>2</sup>) = Reach(ρ). However, we can check that ϕ(G<sub>ρ<sup>2</sup></sub>) = ϕ(G<sub>ρ</sub>), as expected.

**Computation of Pre and Post.** By Lemma 1 and the construction of constraint graphs, one can easily compute Pre<sub>ρ</sub>(Z) = {ν | ∃ν′ ∈ Z, (ν, ν′) ∈ Reach(ρ)} for a given path ρ and zone Z (see [8,16]). In fact, consider the normalised constraint graph G<sub>ρ</sub> on the nodes X<sup>ℓ</sup> ∪ X<sup>r</sup>. To compute Pre<sub>ρ</sub>(Z), one just needs to add the constraints of Z on X<sup>r</sup>. This is done by replacing each edge X<sup>r</sup><sub>i</sub> → X<sup>r</sup><sub>j</sub> of weight w by one of weight min(Z<sub>j,i</sub>, w), where Z<sub>j,i</sub> = (≺, p) defines the constraint of Z on x<sub>j</sub> − x<sub>i</sub>. Then, the normalisation of the graph describes the reachability relation along the path ρ ending in the zone Z. Furthermore, projecting the constraints onto X<sup>ℓ</sup> yields Pre<sub>ρ</sub>(Z): this is obtained by gathering all constraints on pairs of nodes of X<sup>ℓ</sup>. A reachability relation can thus be seen as a function assigning to each zone Z its image by ρ. One can symmetrically compute the successor Post<sub>ρ</sub>(Z) = {ν′ | ∃ν ∈ Z, (ν, ν′) ∈ Reach(ρ)} by constraining the nodes of X<sup>ℓ</sup> and projecting onto X<sup>r</sup>.

#### **4 Robust Iterability of a Lasso**

In this section, we study the perturbation game G<sub>δ</sub>(A) between the two players (controller and environment), as defined in Sect. 2, when the timed automaton A is restricted to a fixed *lasso* ρ<sub>1</sub>ρ<sub>2</sub>, i.e. ρ<sub>1</sub> is a path from ℓ<sub>0</sub> to some accepting location ℓ<sub>t</sub>, and ρ<sub>2</sub> is a cyclic path around ℓ<sub>t</sub>. This implies that the controller does not choose the transitions, but only the delays. We will consider different settings, in which δ is fixed or not.

**Controllable Predecessors and Their Greatest Fixpoints.** Consider an edge e = (ℓ, g, R, ℓ′). For any set Z ⊆ ℝ<sub>≥0</sub><sup>X</sup>, we define the *controllable predecessors* of Z as follows: CPre<sup>δ</sup><sub>e</sub>(Z) = Pre<sub>time</sub><sup>>δ</sup>(shrink<sub>[−δ,δ]</sub>(⟦g⟧ ∩ Unreset<sub>R</sub>(Z))). Intuitively, CPre<sup>δ</sup><sub>e</sub>(Z) is the set of valuations from which the controller can ensure reaching Z in one step, following the edge e, regardless of the perturbations of amplitude at most δ chosen by the environment. Indeed, the controller can aim at shrink<sub>[−δ,δ]</sub>(⟦g⟧ ∩ Unreset<sub>R</sub>(Z)) with a delay of more than δ; under any perturbation in [−δ, δ], the resulting valuation satisfies the guard and ends, after the reset, in Z. Results of [24] show that this operator can be computed in cubic time with respect to the number of clocks. We extend this operator to a path ρ by composition, denoting it by CPre<sup>δ</sup><sub>ρ</sub>. Note that CPre<sup>0</sup><sub>ρ</sub> = Pre<sub>ρ</sub> is the usual predecessor operator without perturbation.

This operator is monotone, hence its greatest fixpoint νX·CPre<sup>δ</sup><sub>ρ</sub>(X) is well defined and equal to ⋂<sub>i≥0</sub> CPre<sup>δ</sup><sub>ρ<sup>i</sup></sub>(⊤), where ⊤ is the full set of valuations: it corresponds to the set of valuations from which the controller can guarantee to loop forever along the path ρ. By definition of the game G<sub>δ</sub>(A) where A is restricted to the lasso ρ<sub>1</sub>ρ<sub>2</sub>, the controller wins the game if and only if **0** ∈ CPre<sup>δ</sup><sub>ρ<sub>1</sub></sub>(νX·CPre<sup>δ</sup><sub>ρ<sub>2</sub></sub>(X)). As a consequence, our problem reduces to the computation of this greatest fixpoint.
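The fixpoint iteration can be illustrated on a deliberately simplified one-clock model: zones are half-open intervals [lo, hi) (strictness bookkeeping is elided), the cycle is a single edge with guard x ≤ 2, and all constants are illustrative. With one clock, Unreset<sub>{x}</sub>(Z) is the full space iff 0 ∈ Z. The two runs below mirror Example 1: a cycle that resets its clock stays robustly iterable, while one that accumulates perturbations shrinks to the empty set.

```python
# One-clock sketch of the greatest fixpoint of CPre^δ_ρ on intervals.
EMPTY = None
GUARD = (0.0, 2.0)          # [[x <= 2]], approximated as [0, 2)
M = 10.0                    # clocks bounded by M

def inter(a, b):
    if a is EMPTY or b is EMPTY:
        return EMPTY
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else EMPTY

def shrink(z, d):
    """shrink_[-d,d](Z): valuations v with v + [-d, d] inside Z."""
    return EMPTY if z is EMPTY else inter((z[0] + d, z[1] - d), (0.0, M))

def pretime(z, d):
    """Pre_time^{>d}(Z): valuations from which some delay > d reaches Z."""
    return EMPTY if z is EMPTY else inter((0.0, z[1] - d), (0.0, M))

def cpre(z, d, reset=True):
    """CPre^δ_e(Z) = Pretime_{>δ}(shrink_[-δ,δ]([[g]] ∩ Unreset(Z)))."""
    if reset:   # one clock: Unreset_{x}(Z) is everything iff 0 ∈ Z
        un = (0.0, M) if (z is not EMPTY and z[0] <= 0.0 < z[1]) else EMPTY
    else:       # no reset: Unreset is the identity
        un = z
    return pretime(shrink(inter(GUARD, un), d), d)

def gfp(d, reset=True, z=(0.0, M)):
    """Iterate Z <- CPre(Z) from the full space until stabilisation."""
    while True:
        nxt = cpre(z, d, reset)
        if nxt == z:
            return z
        if nxt is EMPTY:
            return EMPTY
        z = nxt

print(gfp(0.25, reset=True))    # (0.0, 1.5): robust, non-empty fixpoint
print(gfp(0.25, reset=False))   # None: perturbations accumulate, gfp empty
```

In the resetting case the iteration stabilises after one step; without the reset, each iteration loses 2δ of slack, exactly the accumulation phenomenon ruled out for the cycle around ℓ<sub>2</sub> in Example 1.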

**Branching Constraint Graphs.** We first consider a fixed (rational) value of the parameter δ, and are interested in the computation of the greatest fixpoint νX·CPre<sup>δ</sup><sub>ρ<sub>2</sub></sub>(X). In [16], constraint graphs were used to provide a termination criterion allowing one to compute the greatest fixpoint of the classical predecessor operator CPre<sup>0</sup><sub>ρ</sub>. We generalize this approach to deal with the operator CPre<sup>δ</sup><sub>ρ</sub>, and to this end, we need to generalize constraint graphs so as to encode it. Unfortunately, the operator shrink<sub>[−δ,δ]</sub> cannot be encoded in a constraint graph. Intuitively, this comes from the fact that a constraint graph represents a relation between valuations, while there is no such relation associated with the CPre<sup>δ</sup><sub>ρ</sub> operator. Instead, we introduce *branching constraint graphs*, which faithfully represent the CPre<sup>δ</sup><sub>ρ</sub> operator: unlike the constraint graphs introduced so far, which have one left layer and one right layer of variables, a branching constraint graph still has a single left layer but several right layers.

We first define a branching constraint graph G<sup>δ</sup><sub>shrink</sub> associated with the operator shrink<sub>[−δ,δ]</sub> as follows. Its set of vertices is composed of three copies of {X<sub>0</sub>, X<sub>1</sub>,...,X<sub>n</sub>}: an unprimed, a primed and a doubly primed version. Edges are defined so as to encode the following constraints: X′<sub>i</sub> = X<sub>i</sub> and X″<sub>i</sub> = X<sub>i</sub> for every i ≠ 0, while X′<sub>0</sub> = X<sub>0</sub> + δ and X″<sub>0</sub> = X<sub>0</sub> − δ. Several occurrences of this graph can be found in Fig. 2.

**Proposition 1.** *Let* Z *be a zone and* G<sup>δ</sup><sub>shrink</sub>(Z) *be the graph obtained from* G<sup>δ</sup><sub>shrink</sub> *by adding on primed and doubly primed vertices the constraints defining* Z *(as for* Pre<sub>ρ</sub>(Z) *at the end of Sect. 3). Then the constraint on unprimed vertices obtained from the shortest paths in* G<sup>δ</sup><sub>shrink</sub>(Z) *is equivalent to* shrink<sub>[−δ,δ]</sub>(Z)*.*

*Proof.* Given a zone Z and a real number d, we define Z + d = {ν + d | ν ∈ Z}. One easily observes that shrink<sub>[−δ,δ]</sub>(Z) = (Z + δ) ∩ (Z − δ). The result follows from the observation that taking two distinct copies of the vertices and considering shortest paths allows one to encode the intersection.
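The identity shrink<sub>[−δ,δ]</sub>(Z) = (Z + δ) ∩ (Z − δ) can be illustrated on one-dimensional interval zones. The sketch below is an illustration of this identity only; the paper works with DBMs, not bare intervals:

```python
# Sketch (not the paper's DBM implementation): shrink_{[-d,d]}(Z) = (Z + d) ∩ (Z - d)
# illustrated on one-dimensional interval zones Z = (lo, hi).

def shift(zone, d):
    """Translate an interval zone by d time units."""
    lo, hi = zone
    return (lo + d, hi + d)

def intersect(z1, z2):
    """Intersection of two interval zones; None if empty."""
    lo, hi = max(z1[0], z2[0]), min(z1[1], z2[1])
    return (lo, hi) if lo <= hi else None

def shrink(zone, d):
    """shrink_{[-d,d]}(Z): valuations that stay in Z under any perturbation in [-d, d]."""
    return intersect(shift(zone, d), shift(zone, -d))

# Example: shrinking [1, 4] by d = 0.5 keeps only [1.5, 3.5].
```

Shrinking an interval narrower than 2d yields the empty zone, matching the intuition that such a zone cannot absorb perturbations of magnitude d.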

Then, for every edge e = (ℓ, g, R, ℓ′), we define the branching constraint graph G<sup>δ</sup><sub>e</sub> as the graph obtained by stacking (in this order) the branching constraint graphs G<sup>>δ</sup><sub>time</sub>, G<sup>δ</sup><sub>shrink</sub> and G<sup>g,Y</sup><sub>edge</sub>. Note that two copies of the graph G<sup>g,Y</sup><sub>edge</sub> are needed, to be connected to the two sets of vertices on the right of the graph G<sup>δ</sup><sub>shrink</sub>. This definition is extended in the expected way to a finite path ρ, yielding the graph G<sup>δ</sup><sub>ρ</sub>. In this graph, there is a single set of vertices on the left, and 2<sup>|ρ|</sup> sets of vertices on the right. As a direct consequence of the previous results on the constraint graphs for time elapse, shrinking and guard/reset, one obtains:

**Proposition 2.** *Let* Z *be a zone and* ρ *be a path. We let* G<sup>δ</sup><sub>ρ</sub>(Z) *be the graph obtained from* G<sup>δ</sup><sub>ρ</sub> *by adding on every set of right vertices the constraints defining* Z*. Then the constraint on the left layer of vertices obtained from the shortest paths in* G<sup>δ</sup><sub>ρ</sub>(Z) *is equivalent to* CPre<sup>δ</sup><sub>ρ</sub>(Z)*.*

An example of the graph G<sup>δ</sup><sub>ρ</sub>(Z) for ρ = e<sub>1</sub>e<sub>2</sub>, the edges considered in Fig. 3, is depicted in Fig. 2 (on the left).

**Fig. 2.** On the left, the branching constraint graph G<sup>δ</sup><sub>e1e2</sub> encoding the operator CPre<sup>δ</sup><sub>e1e2</sub>, where e<sub>1</sub> and e<sub>2</sub> refer to the edges considered in Fig. 3. Dashed edges have weight (<, .), plain edges have weight (≤, .). Black edges (resp. orange, pink, red, blue edges) are labelled by (., 0) (resp. (., −δ), (., δ), (., 2), (., −2)). On the right, a decomposition of a path in a branching constraint graph G<sup>δ</sup><sub>ρ</sub>. (Color figure online)

We are now ready to prove the following result, a generalisation of [16, Lemma 2], which will allow us to compute the greatest fixpoint of the operator CPre<sup>δ</sup><sub>ρ</sub>:

**Fig. 3.** On the left, the constraint graph of the path ℓ<sub>1</sub> −(x<sub>1</sub> ≤ 2, x<sub>1</sub> := 0)→ ℓ<sub>2</sub> −(x<sub>2</sub> ≤ 2, x<sub>2</sub> := 0)→ ℓ<sub>1</sub>. On the right, its normalised version: dashed edges have weight (<, .), plain edges have weight (≤, .), black edges have weight (., 0), red edges have weight (., 2) and blue edges have weight (., −2).

**Proposition 3.** *Let* ρ *be a path and* δ *be a non-negative rational number. We let* N = |X<sub>0</sub>|<sup>2</sup>*. If* (CPre<sup>δ</sup><sub>ρ</sub>)<sup>2N+1</sup>(⊤) ⊊ (CPre<sup>δ</sup><sub>ρ</sub>)<sup>2N</sup>(⊤)*, then* νX CPre<sup>δ</sup><sub>ρ</sub>(X) = ∅*.*

*Proof.* Assume (CPre<sup>δ</sup><sub>ρ</sub>)<sup>2N+1</sup>(⊤) ⊊ (CPre<sup>δ</sup><sub>ρ</sub>)<sup>2N</sup>(⊤) and consider the zones (CPre<sup>δ</sup><sub>ρ</sub>)<sup>N+1</sup>(⊤) (represented by the DBM M<sub>1</sub>) and (CPre<sup>δ</sup><sub>ρ</sub>)<sup>N</sup>(⊤) (represented by the DBM M<sub>2</sub>). We have M<sub>1</sub> ⊊ M<sub>2</sub>, as otherwise the fixpoint would have already been reached after N steps. By Proposition 2, the zone corresponding to M<sub>1</sub> is associated with shortest paths between vertices on the left in the graph G<sup>δ</sup><sub>ρ<sup>N+1</sup></sub>. In the sequel, given a path r in this graph, w(r) denotes its weight. We distinguish two cases:

**Case 1:** M<sub>1</sub> ⊊ M<sub>2</sub> because of the rational coefficients. Then, there exists an entry (x, y) ∈ X<sub>0</sub><sup>2</sup> such that M<sub>1</sub>[x, y] < M<sub>2</sub>[x, y]. The value M<sub>1</sub>[x, y] is thus associated with a shortest path between the vertices X and Y in G<sup>δ</sup><sub>ρ<sup>N+1</sup></sub>. We fix a shortest path of minimal length and denote it by r. As the entry is strictly smaller than in M<sub>2</sub>, this shortest path must reach the last copy of the graph G<sup>δ</sup><sub>ρ</sub>. This path can be interpreted as a traversal of the binary tree of depth |X<sub>0</sub>|<sup>2</sup> + 1, reaching at least one leaf. We can prove that this entails that there exists a pair of clocks (u, v) ∈ X<sub>0</sub><sup>2</sup> appearing at two levels i < j of this tree, and a decomposition r = r<sub>1</sub>r<sub>2</sub>r<sub>3</sub>r<sub>4</sub>r<sub>5</sub> of the path, such that w(r<sub>2</sub>) + w(r<sub>4</sub>) = (≺, d) with d < 0 (Property (†)). In addition, in this decomposition, r<sub>3</sub> is included in subgraphs of levels k ≥ j, and the pair of paths (r<sub>2</sub>, r<sub>4</sub>) is called a *return path*, following the terminology of [16]. This decomposition is depicted in Fig. 2 (on the right). Intuitively, property (†) follows from the fact that, as r<sub>3</sub> is included in subgraphs of levels k ≥ j, and because the final zone (on the right) is the zone which adds no edges, the concatenation r′ = r<sub>1</sub>r<sub>3</sub>r<sub>5</sub> is also a valid path from X to Y in G<sup>δ</sup><sub>ρ<sup>N+1</sup></sub>, and is shorter than r. We conclude using the fact that r has been chosen as a shortest path of minimal length.

Property (†) allows us to prove that the greatest fixpoint is empty. Indeed, by considering iterations of ρ, one can repeat the return path associated with (r<sub>2</sub>, r<sub>4</sub>) and obtain paths from X to Y whose weights diverge towards −∞.

**Case 2:** M<sub>1</sub> ⊊ M<sub>2</sub> because of the ordering coefficients. We claim that this case cannot occur. Indeed, one can show that the constants do not evolve anymore after the N-th iteration of the fixpoint: the coefficients can only decrease by changing from a non-strict inequality (≤, c) to a strict one (<, c). This propagation of strict inequalities is performed in at most |X<sub>0</sub>|<sup>2</sup> additional steps, thus we have (CPre<sup>δ</sup><sub>ρ</sub>)<sup>2N+1</sup>(⊤) = (CPre<sup>δ</sup><sub>ρ</sub>)<sup>2N</sup>(⊤), yielding a contradiction.

Compared to the result of [16], the number of iterations needed before convergence grows from |X<sub>0</sub>|<sup>2</sup> to 2|X<sub>0</sub>|<sup>2</sup>: this is due to the presence of both strict and non-strict inequalities, not considered in [16]. With the help of branching constraint graphs, we have thus shown that the greatest fixpoint can be computed in finite time: this can then be done directly with computations on zones (and not on branching constraint graphs).

**Proposition 4.** *Given a path* ρ *and a rational number* δ*, the greatest fixpoint* νX CPre<sup>δ</sup><sub>ρ</sub>(X) *can be computed in time polynomial in* |X| *and* |ρ|*. As a consequence, one can decide whether the controller has a strategy along a lasso* ρ<sub>1</sub>ρ<sub>2</sub> *in* G<sub>δ</sub>(A) *in time polynomial in* |X| *and* |ρ<sub>1</sub>ρ<sub>2</sub>|*.*
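The cutoff criterion of Proposition 3 turns the greatest-fixpoint computation into a bounded iteration. A minimal sketch, assuming an abstract monotone shrinking operator on finite sets as a stand-in for the zone-based CPre<sup>δ</sup><sub>ρ</sub>:

```python
# Toy illustration of the cutoff of Proposition 3: iterate a monotone operator
# from the top element; if the iterates still strictly decrease past the bound,
# declare the greatest fixpoint empty. The operator below is an arbitrary
# monotone shrinking map on finite sets, an assumption of this sketch.

def gfp_with_cutoff(cpre, top, bound):
    """Greatest fixpoint of cpre, or None once the cutoff declares it empty."""
    current = top
    for _ in range(bound + 1):
        nxt = cpre(current)
        if nxt == current:           # fixpoint reached within the bound
            return current
        current = nxt
    return None                      # still shrinking after the bound

# Example operator: drop the largest element until only {0} remains.
cpre_op = lambda s: frozenset(x for x in s if x < max(s)) or frozenset({0})
top = frozenset(range(5))
```

With a bound large enough (here 5) the fixpoint {0} is found; with a bound that is too small the procedure reports emptiness, which is exactly the role the bound 2|X<sub>0</sub>|<sup>2</sup> plays in the paper.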

**Solving the Robust Controller Synthesis Problem for a Lasso.** We have shown how to decide whether the controller has a winning strategy for a fixed rational value of δ. We now aim at deciding whether there exists a positive value of δ for which the controller wins the game G<sub>δ</sub>(A) (where A is restricted to a lasso ρ<sub>1</sub>ρ<sub>2</sub>). To this end, we will use a parametrised extension of DBMs, namely *shrunk DBMs*, which were introduced in [24] in order to study the parametrised state space of timed automata. Intuitively, our goal is to express *shrinkings* of guards, e.g. sets of states satisfying constraints of the form g = 1 + δ < x < 2 − δ ∧ 2δ < y, where δ is a parameter to be chosen. Formally, a shrunk DBM is a pair (M, P), where M is a DBM and P is a nonnegative integer matrix called a *shrinking matrix*. This pair represents the set of valuations defined by the DBM M − δP, for any given δ > 0. Considering the example g, M is the DBM of the guard obtained by setting δ = 0, and P is made of the integer multipliers of δ. We adopt the following notation: when we write a statement involving a shrunk DBM (M, P), we mean that for some δ<sub>0</sub> > 0, the statement holds for M − δP for all δ ∈ (0, δ<sub>0</sub>]. For instance, (M, P) = Pre<sup>>δ</sup><sub>time</sub>((N, Q)) means that M − δP = Pre<sup>>δ</sup><sub>time</sub>(N − δQ) for all small enough δ > 0. Shrunk DBMs are closed under the standard operations on zones, and as a consequence, the CPre operator can be computed on shrunk DBMs:

**Lemma 2.** ([25]) *Let* e = (ℓ, g, R, ℓ′) *be an edge and* (M, P) *be a shrunk DBM. Then, there exists a shrunk DBM* (N, Q)*, which we can compute in polynomial time, such that* (N, Q) = CPre<sup>δ</sup><sub>e</sub>((M, P))*.*
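The shrunk DBM pair (M, P) can be sketched as follows. This toy version keeps only numeric bounds (no strict/non-strict flags) and uses the guard example g from above; it is an illustration, not the paper's implementation:

```python
# Sketch of a shrunk DBM (M, P): M is a DBM and P a nonnegative integer
# shrinking matrix; the pair denotes the zone of M - delta*P for every small
# enough delta > 0. Toy version: numeric bounds only, no strictness flags.

class ShrunkDBM:
    def __init__(self, M, P):
        self.M = M                    # M[i][j] bounds the difference x_i - x_j
        self.P = P                    # integer multipliers of delta

    def at(self, delta):
        """Concrete DBM obtained by fixing the perturbation delta."""
        n = len(self.M)
        return [[self.M[i][j] - delta * self.P[i][j] for j in range(n)]
                for i in range(n)]

# Shrinking of 1 < x < 2 into 1 + delta < x < 2 - delta, over the clock
# vector (x0 = 0, x): x - x0 <= 2 - delta and x0 - x <= -1 - delta.
g = ShrunkDBM(M=[[0, -1], [2, 0]], P=[[0, 1], [1, 0]])
```

Evaluating `g.at(0.25)` tightens the upper bound on x to 1.75 and the lower bound to 1.25, i.e. the guard shrunk by δ = 0.25.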

**Proposition 5.** *Given a path* ρ*, one can compute a shrunk DBM* (M, P) *equal to the greatest fixpoint of the operator* CPre<sup>δ</sup><sub>ρ</sub>*. As a consequence, one can solve the parametrised robust controller synthesis problem for a given lasso in time polynomial in the number of clocks and in the length of the lasso.*

*Proof.* The bound 2|X<sub>0</sub>|<sup>2</sup> identified previously does not depend on the value of δ. Hence the algorithm for computing a shrunk DBM representing the greatest fixpoint proceeds as follows. It computes symbolically, using shrunk DBMs, the 2|X<sub>0</sub>|<sup>2</sup>-th and (2|X<sub>0</sub>|<sup>2</sup> + 1)-th iterations of the operator CPre<sup>δ</sup><sub>ρ</sub>, starting from the zone ⊤. By monotonicity, the (2|X<sub>0</sub>|<sup>2</sup> + 1)-th iteration is included in the 2|X<sub>0</sub>|<sup>2</sup>-th. If the two shrunk DBMs are equal, then they are also equal to the greatest fixpoint. Otherwise, the greatest fixpoint is empty. To decide the robust controller synthesis problem for a given lasso, one first computes a shrunk DBM representing the greatest fixpoint associated with ρ<sub>2</sub> and, if it is not empty, one computes a new shrunk DBM by applying to it the operator CPre<sup>δ</sup><sub>ρ1</sub>. Then, one checks whether the valuation **0** belongs to the resulting shrunk DBM.

**Computing the Largest Admissible Perturbation.** We say that a perturbation δ is *admissible* if the controller wins the game G<sub>δ</sub>(A). The parametrised robust controller synthesis problem, solved above for a lasso, asks whether *there exists* a positive admissible perturbation. A more ambitious problem consists in determining the *largest* admissible perturbation.

The previous algorithm performs a bounded number (2|X<sub>0</sub>|<sup>2</sup>) of computations of the CPre<sup>δ</sup><sub>ρ</sub> operator. Instead of focusing on arbitrarily small values of δ using shrunk DBMs as we did previously, we must perform a computation for all values of δ. To do so, we consider an extension of (shrunk) DBMs in which each entry of the matrix (which thus represents a clock constraint) is a piecewise affine function of δ. One can observe that all the operations involved in the computation of the CPre<sup>δ</sup><sub>ρ</sub> operator can be performed symbolically w.r.t. δ using piecewise affine functions. As a consequence, we obtain the following new result:

#### **Proposition 6.** *We can compute the largest admissible perturbation of a lasso.*

*Proof.* Let ρ<sub>1</sub>ρ<sub>2</sub> be a lasso. One first computes a symbolic representation, valid for all values of δ, of the greatest fixpoint of CPre<sup>δ</sup><sub>ρ2</sub>. To do so, one computes the 2|X<sub>0</sub>|<sup>2</sup>-th and (2|X<sub>0</sub>|<sup>2</sup> + 1)-th iterations of this operator, starting from the zone ⊤. We denote them by M<sub>1</sub> and M<sub>2</sub> respectively. By monotonicity, the inclusion M<sub>1</sub>(δ) ⊆ M<sub>2</sub>(δ) holds for every δ ≥ 0. In addition, both M<sub>1</sub> and M<sub>2</sub> are decreasing w.r.t. δ, thus one can identify the value δ<sub>0</sub> = inf{δ ≥ 0 | M<sub>1</sub>(δ) ⊊ M<sub>2</sub>(δ)}. Then, the greatest fixpoint is equal to M<sub>1</sub> for δ < δ<sub>0</sub>, and to the empty set for δ at least δ<sub>0</sub>. As a second step, one applies the operator CPre<sup>δ</sup><sub>ρ1</sub> to the greatest fixpoint. We denote the result by M. To conclude, one can then compute and return the value sup{δ ∈ [0, δ<sub>0</sub>) | **0** ∈ M(δ)} of the maximal perturbation.
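The final computation of sup{δ | **0** ∈ M(δ)} reduces to elementary reasoning on the entries once the symbolic representation is available. A sketch under the simplifying assumption that each relevant constraint is a single affine function c − kδ with k ≥ 0 (the paper handles piecewise affine entries):

```python
# Sketch of the last step: if membership 0 ∈ M(δ) is equivalent to a finite
# conjunction of constraints c - k*delta >= 0 with k >= 0 (an assumption of
# this illustration), the supremum of admissible delta is the smallest root
# c/k over the decreasing entries, capped by delta_0.

def sup_admissible(affine_entries, delta0):
    """affine_entries: list of (c, k) encoding the constraint c - k*delta >= 0.
    Returns sup{delta in [0, delta0) | all constraints hold}, or None."""
    sup = delta0
    for c, k in affine_entries:
        if k > 0:
            sup = min(sup, c / k)    # constraint fails beyond c/k
        elif c < 0:
            return None              # constant constraint violated everywhere
    return sup if sup > 0 else None

# 0 ∈ M(δ) iff 3 - 2δ >= 0 and 1 - δ >= 0, within δ0 = 2: the supremum is 1.
```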

#### **5 Synthesis of Robust Controllers**

We are now ready to solve the parametrised robust controller synthesis problem, that is, to find, if they exist, a lasso ρ<sub>1</sub>ρ<sub>2</sub> and a perturbation δ such that the controller wins the game G<sub>δ</sub>(A) when following the lasso ρ<sub>1</sub>ρ<sub>2</sub> as a strategy. As for the symbolic checking of emptiness of a Büchi timed language [17], we will use a double forward analysis to exhaust all possible lassos, each being tested for robustness by the techniques studied in the previous section: a first forward analysis searches for ρ<sub>1</sub>, a path from the initial location to an accepting location ℓ, and a second forward analysis from each accepting location ℓ finds the cycle ρ<sub>2</sub> around ℓ. Forward analysis means that we compute the successor zone Post<sub>ρ</sub>(Z) when following path ρ from zone Z.

**Abstractions of Lassos.** Before studying in more detail the two independent forward analyses, we first study what information we must keep about ρ<sub>1</sub> and ρ<sub>2</sub> in order to still be able to test the robustness of the lasso ρ<sub>1</sub>ρ<sub>2</sub>. A classical problem for robustness is the firing of a *punctual transition*, i.e. a transition where the controller has a single choice of time delay: clearly, such a firing is robust for no possible choice of the parameter δ. Therefore, we must at least forbid such punctual transitions in our forward analyses. We thus introduce a non-punctual successor operator Post<sup>np</sup><sub>ρ</sub>. It consists of the standard successor operator Post<sub>ρ</sub> in the timed automaton A<sup>np</sup> obtained from A by making strict every constraint appearing in the guards (1 ≤ x ≤ 2 becomes 1 < x < 2). The crucial point is that if a positive delay d can be taken by the controller while satisfying a set of strict constraints, then other delays close enough to d are also possible. By analogy, a region is said to be *non-punctual* if it contains two valuations separated by a positive time delay. In particular, if such a region satisfies a constraint in A, it also satisfies the corresponding strict constraint in A<sup>np</sup>. Therefore, the controller wins G<sub>δ</sub>(A) for some δ > 0 if and only if he wins G<sub>δ</sub>(A<sup>np</sup>) for some δ > 0.
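Building A<sup>np</sup> amounts to making every guard constraint strict. A sketch over a toy guard encoding, where guards are lists of (clock, op, constant) triples (an assumption of this illustration, not the paper's data model):

```python
# Sketch of the A^np construction: every non-strict comparison in a guard
# becomes strict, so that any feasible delay admits nearby alternatives.
# Guards are kept as (clock, op, constant) triples, a toy encoding.

def strictify(guard):
    """Turn non-strict comparisons into strict ones: <= becomes <, >= becomes >."""
    weaken = {"<=": "<", ">=": ">", "<": "<", ">": ">"}
    return [(clk, weaken[op], c) for clk, op, c in guard]

# The guard 1 <= x <= 2 becomes 1 < x < 2:
g = [("x", ">=", 1), ("x", "<=", 2)]
```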

The link between non-punctuality and robustness is as follows:

**Theorem 2.** *Let* ρ1ρ<sup>2</sup> *be a lasso of the timed automaton. We have*

$$\exists \delta > 0 \; \mathbf{0} \in \mathsf{CPre}\_{\rho\_1}^{\delta}(\nu X \, \mathsf{CPre}\_{\rho\_2}^{\delta}(X)) \iff \mathsf{Post}\_{\rho\_1}^{\mathrm{np}}(\mathbf{0}) \cap \Big(\bigcup\_{\delta > 0} \nu X \, \mathsf{CPre}\_{\rho\_2}^{\delta}(X)\Big) \neq \emptyset$$

*Proof.* The proof of this theorem relies on three main ingredients:


Therefore, in order to test the robustness of the lasso ρ<sub>1</sub>ρ<sub>2</sub>, it is enough to only keep in memory the sets Post<sup>np</sup><sub>ρ1</sub>(**0**) and ⋃<sub>δ>0</sub> νX CPre<sup>δ</sup><sub>ρ2</sub>(X).

**Non-punctual Forward Analysis.** As a consequence of the previous theorem, we can use a classical forward analysis of the timed automaton A<sup>np</sup> to look for the prefix ρ<sub>1</sub> of the lasso ρ<sub>1</sub>ρ<sub>2</sub>. A classical inclusion check on zones allows one to stop the exploration, this criterion being complete thanks to Theorem 2. It is worth recalling that we consider only bounded clocks, hence the number of reachable zones is finite, ensuring termination.
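The forward analysis with inclusion-based pruning can be sketched generically; here zones are toy finite sets and `post` is an assumed successor function, both standing in for the paper's zone-based operators:

```python
from collections import deque

# Sketch of a forward exploration with inclusion-based pruning: explore
# (location, zone) pairs, skipping any zone included in one already visited
# at the same location.

def forward_analysis(init, post, accepting):
    """Return the accepting (location, zone) pairs reachable from init."""
    visited = {}                          # location -> zones already explored
    found = []
    queue = deque([init])
    while queue:
        loc, zone = queue.popleft()
        if any(zone <= seen for seen in visited.get(loc, [])):
            continue                      # subsumed by a visited zone: prune
        visited.setdefault(loc, []).append(zone)
        if loc in accepting:
            found.append((loc, zone))
        queue.extend(post(loc, zone))
    return found

# Toy system: location 'a' has a single edge into accepting location 'b'.
post = lambda loc, zone: [('b', frozenset({1}))] if loc == 'a' else []
```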

**Robust Cycle Search.** We now perform a second forward analysis, from each possible final location, to find a robust cycle around it. To this end, for each cycle ρ<sub>2</sub>, we must compute the zone ⋃<sub>δ>0</sub> νX CPre<sup>δ</sup><sub>ρ2</sub>(X). This computation is obtained by the arguments developed in Sect. 4 (Proposition 4). To enumerate the cycles ρ<sub>2</sub>, we can again use a classical forward exploration, starting from the universal zone ⊤. Using zone inclusion to stop the exploration is not complete, however: considering a path ρ′<sub>2</sub> reaching a zone Z′<sub>2</sub> included in the zone Z<sub>2</sub> reachable using some ρ<sub>2</sub>, the path ρ′<sub>2</sub> could be robustly iterable while ρ<sub>2</sub> is not. In order to ensure termination of our analysis, we instead use inclusion checks on reachability relations. These tests are performed using the technique developed in Sect. 3, based on constraint graphs (Theorem 1). The correctness of this inclusion check is stated in the following lemma, where Reach<sup>np</sup><sub>ρ</sub> denotes the reachability relation associated with ρ in the automaton A<sup>np</sup>. This result is derived from the analysis based on regions in [25]. Indeed, we can prove that the non-punctual reachability relation we consider captures the existence of non-punctual aperiodic paths in the region automaton, as considered in [25].

**Lemma 3.** *Let* ρ<sub>1</sub> *be a path from* ℓ<sub>0</sub> *to some target location* ℓ<sub>t</sub>*. Let* ρ<sub>2</sub>, ρ′<sub>2</sub> *be two paths from* ℓ<sub>t</sub> *to some location* ℓ*, such that* Reach<sup>np</sup><sub>ρ2</sub> ⊆ Reach<sup>np</sup><sub>ρ′2</sub>*. For all paths* ρ<sub>3</sub> *from* ℓ *to* ℓ<sub>t</sub>*,* Post<sup>np</sup><sub>ρ1</sub>(**0**) ∩ (⋃<sub>δ>0</sub> νX CPre<sup>δ</sup><sub>ρ2ρ3</sub>(X)) ≠ ∅ *implies* Post<sup>np</sup><sub>ρ1</sub>(**0**) ∩ (⋃<sub>δ>0</sub> νX CPre<sup>δ</sup><sub>ρ′2ρ3</sub>(X)) ≠ ∅*.*

#### **6 Case Study**

We implemented our algorithm in C++. To illustrate our approach, we present a case study on the regulation of train networks. Urban train networks in big cities are often particularly busy during rush hours: trains run at high frequency, so even small delays due to incidents or passenger misbehavior can perturb the traffic and end up causing large delays. Train companies thus apply regulation techniques: they slow down or accelerate trains, and modify waiting times in order to make sure that the traffic is fluid along the network. Computing robust schedules with provable guarantees is a difficult problem (see e.g. [9]).

We study here a simplified model of a train network and aim at automatically synthesizing a controller that regulates the network despite perturbations, in order to ensure performance measures on the total travel time of each train. Consider a circular train network with m stations s<sub>0</sub>, ..., s<sub>m−1</sub> and n trains. We require that all trains are at distinct stations at all times. There is an interval of delays [ℓ<sub>i</sub>, u<sub>i</sub>] attached to each station, which bounds the travel time from s<sub>i</sub> to s<sub>i+1 mod m</sub>. Here the lower bound comes from physical limits (maximal allowed speed and travel distance) while the upper bound comes from operator specification (e.g. it is not desirable for a train to remain at a station for more than 3 min). The objective of each train i is to cycle on the network while completing each tour within a given time interval [t<sup>i</sup><sub>1</sub>, t<sup>i</sup><sub>2</sub>].
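As a quick sanity check on such instances, the interval arithmetic underlying a tour objective can be sketched as follows (illustrative arithmetic only; the paper encodes these requirements with clocks in a timed automaton):

```python
# Sketch: a tour objective [t1, t2] is achievable from per-station travel
# intervals [l_i, u_i] iff the interval of possible tour durations
# [sum l_i, sum u_i] intersects [t1, t2]. Station data below is hypothetical.

def tour_feasible(intervals, t1, t2):
    lo = sum(l for l, _ in intervals)
    hi = sum(u for _, u in intervals)
    return max(lo, t1) <= min(hi, t2)

# Three stations with travel intervals [200, 400] and objective [750, 1050]:
stations = [(200, 400)] * 3
```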

All timing requirements are naturally encoded with clocks. Given a model, we solve the robust controller synthesis problem in order to find a controller choosing travel times for all trains while ensuring a Büchi condition (visiting s<sub>1</sub> infinitely often). Given that trains cannot be at the same station at any given time, it suffices to state the Büchi condition for one train only, since its satisfaction of the condition necessarily implies that of all other trains.

Let us present two representative instances and then comment on the performance of the algorithm on a set of instances. Consider a network with two trains and m stations, with [ℓ<sub>i</sub>, u<sub>i</sub>] = [200, 400] for each station i, where the objective of both trains is the interval [250 · m, 350 · m], that is, an average travel time between stations that lies in [250, 350]. The algorithm finds an accepting lasso: intuitively, by choosing δ small enough so that mδ < 50, perturbations do not accumulate too much and the controller can always choose delays for both trains and satisfy the constraints. This case corresponds to scenario A in Fig. 4. Consider now the same network but with two different objectives: [0, 300·m] and [300·m, ∞). Thus, one train needs to complete each


**Fig. 4.** Summary of experiments with different sizes. In each scenario, we assign a different objective to a subset of trains. The answer is *yes* if a robust controller was found, *no* if none exists. TO stands for a time-out of 30 min.

cycle in at most 300 · m time units, while the other one needs at least 300 · m time units. A classical Büchi emptiness check reveals the existence of an accepting lasso: it suffices to move each train in exactly 300 time units between each station. This controller can even recover from perturbations for a bounded number of cycles: for instance, if a train arrives late at a station, the next travel time can be chosen smaller than 300. However, such corrections cause the distance between the two trains to decrease, and if such perturbations happen regularly, the system eventually enters a deadlock. Our algorithm detects that there is no robust controller for the Büchi objective. This corresponds to scenario B in Fig. 4.

Figure 4 summarizes the outcome of our prototype implementation on other scenarios. The tool was run on a 3.2 GHz Intel i7 processor running Linux, with a 30 min timeout and 2 GB of memory. The performance is sensitive to the number of clocks: on scenarios with 8 clocks, the algorithm ran out of time.

#### **7 Conclusion**

Our case study illustrates the application of robust controller synthesis to problems of small or moderate size. Our prototype relies on DBM libraries, which we use with twice as many clocks to store the constraints of the normalised constraint graphs. In order to scale to larger models, we plan to study extrapolation operators and their integration in the computation of reachability relations, which seems to be a challenging task. Different strategies can also be adopted for the double forward analysis, such as switching between the two modes using heuristics, a parallel implementation, etc.

#### **References**



### **Flexible Computational Pipelines for Robust Abstraction-Based Control Synthesis**

Eric S. Kim(B) , Murat Arcak , and Sanjit A. Seshia

UC Berkeley, Berkeley, CA, USA {eskim,arcak,sseshia}@eecs.berkeley.edu

**Abstract.** Successfully synthesizing controllers for complex dynamical systems and specifications often requires leveraging domain knowledge as well as making difficult computational or mathematical tradeoffs. This paper presents a flexible and extensible framework for constructing robust control synthesis algorithms and applies it to the traditional abstraction-based control synthesis pipeline. It is grounded in the theory of relational interfaces and provides a principled methodology to seamlessly combine different techniques (such as dynamic precision grids, refining abstractions while synthesizing, or decomposed control predecessors) or create custom procedures to exploit an application's intrinsic structural properties. A Dubins vehicle is used as a motivating example to showcase memory and runtime improvements.

**Keywords:** Control synthesis · Finite abstraction · Relational interface

#### **1 Introduction**

A control synthesizer's high level goal is to automatically construct control software that enables a closed loop system to satisfy a desired specification. A vast and rich literature contains results that mathematically characterize solutions to different classes of problems and specifications, such as the Hamilton-Jacobi-Isaacs PDE for differential games [3], Lyapunov theory for stabilization [8], and fixed-points for temporal logic specifications [11,17]. While many control synthesis problems have elegant mathematical solutions, there is often a gap between a solution's theoretical characterization and the algorithms used to compute it. What data structures are used to represent the dynamics and constraints? What operations should those data structures support? How should the control synthesis algorithm be structured? Implementing answers to these questions can require substantial time. This gap is especially critical for computationally challenging problems, where it is often necessary to let the user *rapidly* identify and exploit structure through analysis or experimentation.

The authors were funded in part by AFOSR FA9550-18-1-0253, DARPA Assured Autonomy project, iCyPhy, Berkeley Deep Drive, and NSF grant CNS-1739816.

**Fig. 1.** By expressing many different techniques within a common framework, users are able to rapidly develop methods to exploit system structure in controller synthesis.

#### **1.1 Bottlenecks in Abstraction-Based Control Synthesis**

This paper's goal is to enable a framework to develop extensible tools for robust controller synthesis. It was inspired in part by computational bottlenecks encountered in control synthesizers that construct finite abstractions of continuous systems, which we use as a target use case. A traditional abstraction-based control synthesis pipeline consists of three distinct stages:


This pipeline appears in the tools PESSOA [12] and SCOTS [19], which can exhibit acute computational bottlenecks for high dimensional and nonlinear system dynamics. A common method to mitigate these bottlenecks is to exploit a specific dynamical system's topological and algebraic properties. In MASCOT [7] and CoSyMA [14], multi-scale grids and hierarchical models capture notions of state-space locality. One can incrementally construct an abstraction of the system dynamics while performing the control synthesis step [10,15], as implemented in the tools ROCS [9] and ARCS [4]. The abstraction overhead can also be reduced by representing systems as a collection of components composed in parallel [6,13]. These techniques have been developed in isolation and were not previously interoperable.

#### **1.2 Methodology**

Figure 1 depicts this paper's methodology and organization. The existing control synthesis formalism does not readily lend itself to algorithmic modifications that reflect and exploit structural properties in the system and specification. We use the theory of relational interfaces [22] as a foundation and augment it to express control synthesis pipelines. Interfaces are used to represent both system models and constraints. A small collection of atomic operators manipulates interfaces and is powerful enough to reconstruct many existing control synthesis pipelines.

One may also add new composite operators to encode desirable heuristics that exploit structural properties in the system and specifications. The last three sections encode the techniques for abstraction-based control synthesis from Sect. 1.1 within the relational interfaces framework. By deliberately deconstructing those techniques and then reconstructing them within a compositional framework, it was possible to identify implicit or unnecessary assumptions, then generalize or remove them. This also makes the aforementioned techniques interoperable with one another as well as with future techniques.

Interfaces come equipped with a refinement partial order that formalizes when one interface abstracts another. This paper focuses on preserving the refinement relation and sufficient conditions to refine discrete controllers back to concrete ones. Additional guarantees regarding completeness, termination, precision, or decomposability can be encoded, but impose additional requirements on the control synthesis algorithm and are beyond the scope of this paper.

#### **1.3 Contributions**

To our knowledge, the application of relational interfaces to robust abstractionbased control synthesis is new. The framework's building blocks consist of a collection of small, well understood operators that are nonetheless powerful enough to express many prior techniques. Encoding these techniques as relational interface operations forced us to simplify, formalize, or remove implicit assumptions in existing tools. The framework also exhibits numerous desirable features.


This paper's first half is domain agnostic and applicable to general robust control synthesis problems. The second half applies those insights to the finite abstraction approach to control synthesis. A smaller Dubins vehicle example is used to showcase and evaluate different techniques and their computational gains, compared to the unoptimized problem. In an extended version of this paper available at [1], a 6D lunar lander example leverages all techniques in this paper and introduces a few new ones.

#### **1.4 Notation**

Let '=' be an *assertion* that two objects are mathematically equivalent; as a special case, '≡' is used when those two objects are sets. In contrast, the operator '==' *checks* whether two objects are equivalent, returning true if they are and false otherwise. A special instance of '==' is logical equivalence '⇔'.

Variables are denoted by lower case letters. Each variable v is associated with a domain of values D(v) that is analogous to the variable's type. A composite variable is a set of variables and is analogous to a bundle of wrapped wires. From a collection of variables v<sub>1</sub>, ..., v<sub>M</sub>, a composite variable v can be constructed by taking the union v ≡ v<sub>1</sub> ∪ ... ∪ v<sub>M</sub>, with the domain D(v) ≡ ∏<sup>M</sup><sub>i=1</sub> D(v<sub>i</sub>). Note that the variables v<sub>1</sub>, ..., v<sub>M</sub> above may themselves be composite. As an example, if v is associated with an M-dimensional Euclidean space R<sup>M</sup>, then it is a composite variable that can be broken apart into a collection of atomic variables v<sub>1</sub>, ..., v<sub>M</sub> where D(v<sub>i</sub>) ≡ R for all i ∈ {1, ..., M}. The technical results herein do not distinguish between composite and atomic variables.

Predicates are functions that map variable assignments to a Boolean value. Predicates that stand in for expressions/formulas are denoted with capital letters. Predicates P and Q are logically equivalent (denoted by P ⇔ Q) if and only if P ⇒ Q and Q ⇒ P are true for all variable assignments. The universal and existential quantifiers ∀ and ∃ eliminate variables and yield new predicates. Predicates ∃wP and ∀wP do not depend on w. If w is a composite variable w ≡ w<sup>1</sup> ∪ ... ∪ w<sup>N</sup> then ∃wP is simply a shorthand for ∃w<sup>1</sup> ... ∃w<sup>N</sup> P.
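Over finitely-valued variables, these quantifiers can be sketched by enumeration. Predicates are modelled here as Python functions over assignment dictionaries, an assumption of this sketch rather than the toolbox's BDD-based representation:

```python
# Sketch of quantifier elimination over finite domains: ∃w P and ∀w P
# eliminate the variable w by enumerating its (finite) domain.

def exists(w, domain, P):
    """∃w P: true on an assignment iff some value of w makes P true."""
    return lambda env: any(P({**env, w: val}) for val in domain)

def forall(w, domain, P):
    """∀w P: true on an assignment iff every value of w makes P true."""
    return lambda env: all(P({**env, w: val}) for val in domain)

# Toy predicate P(x, w): x + w is even, with w ranging over {0, 1}.
P = lambda env: (env["x"] + env["w"]) % 2 == 0
```

The resulting predicates depend only on the remaining variables, matching the statement above that ∃wP and ∀wP do not depend on w.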

#### **2 Control Synthesis for a Motivating Example**

As a simple, instructive example, consider a planar Dubins vehicle that is tasked with reaching a desired location. Let x = {p<sub>x</sub>, p<sub>y</sub>, θ} be the collection of state variables, u = {v, ω} be the collection of input variables to be controlled, x<sup>+</sup> = {p<sup>+</sup><sub>x</sub>, p<sup>+</sup><sub>y</sub>, θ<sup>+</sup>} represent the state variables at the subsequent time step, and L = 1.4 be a constant representing the vehicle length. The constraints

$$p\_x^+ = p\_x + v \cos(\theta) \tag{F\_x}$$

$$p\_y^+ = p\_y + v \sin(\theta) \tag{F\_y}$$

$$\theta^+ = \theta + \frac{v}{L}\sin(\omega) \tag{F\_\theta}$$

characterize the discrete time dynamics. The continuous state domain is D(x) ≡ [−2, 2] × [−2, 2] × [−π, π), where the last component is periodic so −π and π are identified. The input domains are D(v) ≡ {0.25, 0.5} and D(ω) ≡ {−1.5, 0, 1.5}.

Let predicate F = Fx ∧ Fy ∧ Fθ represent the monolithic system dynamics. Predicate T depends only on x and represents the target set [−0.4, 0.4] × [−0.4, 0.4] × [−π, π), encoding that the vehicle's position must reach a square with any orientation. Let Z be a predicate that depends on variable x+ and encodes a collection of states at a future time step. Equation (1) characterizes the robust controlled predecessor, which takes Z and computes the set of states from which there exists a non-blocking assignment to u that guarantees x+ will satisfy Z, despite any non-determinism contained in F. The term ∃x+F prevents state-control pairs from blocking, while ∀x+(F ⇒ Z) encodes the state-control pairs that guarantee satisfaction of Z.

$$\mathsf{cpre}(F, Z) = \exists u (\exists x^+ F \land \forall x^+ (F \Rightarrow Z)). \tag{1}$$
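To ground Eq. (1), the following is a minimal sketch that evaluates cpre(·) by brute-force enumeration over small finite domains; the paper's toolbox instead operates on BDD-encoded predicates. The one-dimensional integrator dynamics, variable names, and domains in the usage example are illustrative assumptions, not the Dubins model.

```python
from itertools import product

def assignments(domains, names):
    """Enumerate all assignments (as dicts) over the given variable names."""
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))

def cpre(F, Z, domains, x_vars, u_vars, xp_vars):
    """Robust controlled predecessor of Eq. (1):
    cpre(F, Z) = there exists u ((exists x+ F) and (forall x+ (F => Z))),
    with F a predicate over x, u, x+ and Z a predicate over x+."""
    win = set()
    for x in assignments(domains, x_vars):
        for u in assignments(domains, u_vars):
            succs = [xp for xp in assignments(domains, xp_vars)
                     if F({**x, **u, **xp})]
            # (exists x+ F): the pair (x, u) must not block, and
            # (forall x+ (F => Z)): every successor allowed by F satisfies Z.
            if succs and all(Z(xp) for xp in succs):
                win.add(tuple(sorted(x.items())))
                break  # a non-blocking, winning u exists
    return win

# Usage (assumed toy system): a 1-D integrator p+ = p + v on a 4-cell grid.
# Moves that leave the grid block; the target is the single cell p+ = 2.
domains = {'p': range(4), 'v': (-1, 0, 1), 'pp': range(4)}
F = lambda a: a['pp'] == a['p'] + a['v']
Z = lambda a: a['pp'] == 2
winning = {dict(t)['p'] for t in cpre(F, Z, domains, ['p'], ['v'], ['pp'])}
```

Cell 0 loses: no single move reaches cell 2, and moving left blocks.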

The controlled predecessor is used to solve reach and safety games: iterating an appropriate reach (safe) operator computes a region from which the target T can be reached (respectively, from which the safe set S can be made invariant). Both iterations are given by:

$$\text{ReachIter:} \qquad Z\_0 = \bot \qquad Z\_{i+1} = \mathsf{reach}(F, Z\_i, T) = \mathsf{cpre}(F, Z\_i) \lor T. \tag{2}$$

$$\text{SafetyIter:} \qquad Z\_0 = S \qquad Z\_{i+1} = \mathsf{safe}(F, Z\_i, S) = \mathsf{cpre}(F, Z\_i) \land S. \tag{3}$$
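A minimal sketch of the reach iteration (2) over an explicitly enumerated transition map (an assumed toy representation; the paper's implementation iterates over BDD-encoded predicates):

```python
def cpre_sets(F, Z):
    """Controlled predecessor over an explicit map F: (state, input) -> set of
    successors. A state wins if some input is non-blocking (it has successors)
    and all of its successors land inside Z."""
    return {x for (x, u), succs in F.items() if succs and succs <= set(Z)}

def reach_iter(F, T, max_iter=1000):
    """ReachIter (2): Z_0 = empty, Z_{i+1} = cpre(F, Z_i) union T, run until a
    fixed point (guaranteed here because the state set is finite)."""
    Z = set()
    for _ in range(max_iter):
        Z_new = cpre_sets(F, Z) | set(T)
        if Z_new == Z:
            return Z
        Z = Z_new
    return Z

# Usage (assumed example): a four-state chain with target {2};
# state 3's only input blocks (no successors), so it can never win.
F = {(0, 'a'): {1}, (0, 'b'): {0, 1},
     (1, 'a'): {2}, (2, 'a'): {2}, (3, 'a'): set()}
basin = reach_iter(F, {2})
```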

The above iterations are not guaranteed to reach a fixed point in a finite number of iterations, except under certain technical conditions [21]. Figure 2 depicts an approximate region where the controller can force the Dubins vehicle to enter T. We showcase different improvements relative to a baseline script used to generate Fig. 2. A toolbox that adopts this paper's framework is being actively developed and is open sourced at [2]. It is written in Python 3.6 and uses the dd package as an interface to CUDD [20], a C/C++ library for constructing and manipulating binary decision diagrams (BDDs). All experiments were run on a single core of a 2013 MacBook Pro with a 2.4 GHz Intel Core i7 and 8 GB of RAM.

**Fig. 2.** Approximate solution to the Dubins vehicle reach game visualized as a subset of the state space.

The following section uses relational interfaces to represent the controlled predecessor cpre(·) and iterations (2) and (3) as a computational pipeline. Subsequent sections show how modifying this pipeline leads to favorable theoretical properties and computational gains.

#### **3 Relational Interfaces**

Relational interfaces are predicates augmented with annotations about each variable's role as an input or output<sup>1</sup>. They abstract away a component's internal implementation and only encode an input-output relation.

**Definition 1 (Relational Interface** [22]**).** *An interface* M(i, o) *consists of a predicate* M *over a set of input variables* i *and output variables* o*.*

For an interface M(i, o), we call (i, o) its input-output *signature*. An interface is a *sink* if it contains no outputs, with signature (i, ∅), and a *source* if it contains no inputs, with signature (∅, o). Sink and source interfaces can be interpreted as sets, whereas input-output interfaces are relations. Interfaces encode relations through their predicates and can capture features such as non-deterministic outputs or

<sup>1</sup> Relational interfaces closely resemble assume-guarantee contracts [16]; we opt to use relational interfaces because inputs and outputs play a more prominent role.

blocking (i.e., disallowed, error) inputs. A system blocks for an input assignment if there does not exist a corresponding output assignment that satisfies the interface relation. Blocking is a critical property used to declare *requirements*; sink interfaces impose constraints by modeling constraint violations as blocking inputs. Outputs, on the other hand, exhibit non-determinism, which is treated as an *adversary*. When one interface's outputs are connected to another's inputs, the outputs seek to cause blocking whenever possible.
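The blocking semantics can be illustrated with an assumed explicit-set encoding of an interface's predicate (a sketch; the class name and tabular representation are ours, not the toolbox's API):

```python
class Interface:
    """A relational interface: a predicate over input and output variables,
    stored here as an explicit list of satisfying assignments (rows)."""
    def __init__(self, inputs, outputs, rows):
        self.inputs, self.outputs = set(inputs), set(outputs)
        self.rows = [dict(r) for r in rows]

    def blocks(self, input_assignment):
        """An input assignment blocks if no output assignment makes the
        interface predicate true."""
        items = input_assignment.items()
        return not any(all(row.get(k) == v for k, v in items)
                       for row in self.rows)

# Usage: this interface accepts input i = 0 (with a non-deterministic
# output) but blocks input i = 1, since no row mentions it.
F = Interface(['i'], ['o'], [{'i': 0, 'o': 0}, {'i': 0, 'o': 1}])
```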

#### **3.1 Atomic and Composite Operators**

Operators manipulate interfaces by taking interfaces and variables as inputs and yielding another interface. We will show how the controlled predecessor cpre(·) in (1) can be constructed by composing the operators appearing in [22] with one additional operator. The first, output hiding, removes interface outputs.

**Definition 2 (Output Hiding** [22]**).** *Output hiding operator ohide*(w, F) *over interface* F(i, o) *and outputs* w *yields an interface with signature* (i, o \ w)*.*

$$
\mathit{ohide}(w, F) = \exists w F \tag{4}
$$

Existentially quantifying out w ensures that the input-output behavior over the unhidden variables is still consistent with potential assignments to w. The operator nb(·) is a special variant of *ohide*(·) that hides all outputs, yielding a sink encoding all non-blocking inputs to the original interface.

**Definition 3 (Nonblocking Inputs Sink).** *Given an interface* F(i, o)*, the nonblocking operation nb*(F) *yields a sink interface with signature* (i, ∅) *and predicate nb*(F) = ∃oF*. If* F(i, ∅) *is a sink interface, then nb*(F) = F *yields itself. If* F(∅, o) *is a source interface, then nb*(F) = ⊥ *if and only if* F ⇔ ⊥*; otherwise nb*(F) = ⊤*.*

The interface composition operator takes multiple interfaces and "collapses" them into a single input-output interface. It can be viewed as a generalization of function composition in the special case where each interface encodes a total function (i.e., deterministic output and inputs never block).

**Definition 4 (Interface Composition** [22]**).** *Let* F1(i1, o1) *and* F2(i2, o2) *be interfaces with disjoint output variables* o1 ∩ o2 ≡ ∅ *and* i1 ∩ o2 ≡ ∅*, which signifies that* F2*'s outputs may not be fed back into* F1*'s inputs. Define new composite variables*

$$i o\_{12} \equiv o\_1 \cap i\_2 \tag{5}$$

$$i\_{12} \equiv (i\_1 \cup i\_2) \setminus io\_{12} \tag{6}$$

$$o\_{12} \equiv o\_1 \cup o\_2. \tag{7}$$

*Composition comp*(F1, F2) *is an interface with signature* (i12, o12) *and predicate*

$$F\_1 \land F\_2 \land \forall o\_{12}(F\_1 \Rightarrow \mathit{nb}(F\_2)). \tag{8}$$

*Interface subscripts may be swapped if instead* F2*'s outputs are fed into* F1*.*

Interfaces F1 and F2 are composed in parallel if io21 ≡ ∅ holds in addition to io12 ≡ ∅. Equation (8) under parallel composition reduces to F1 ∧ F2 (Lemma 6.4 in [22]), and comp(·) is commutative and associative. If io12 ≢ ∅, then the interfaces are composed in series and the composition operator is only associative. Any acyclic interconnection can be composed into a single interface by systematically applying Definition 4's binary composition operator. Non-deterministic outputs are interpreted to be *adversarial*. Series composition of interfaces has a built-in notion of robustness to account for F1's non-deterministic outputs and blocking inputs to F2 over the shared variables io12. The term ∀o12(F1 ⇒ nb(F2)) in Eq. (8) is a predicate over the composition's input set i12. It ensures that if a potential output of F1 may cause F2 to block, then comp(F1, F2) must preemptively block.
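Equation (8)'s robustness term can be made concrete with predicates as callables over assignment dicts, evaluating quantifiers by enumeration (an illustrative sketch; the helper names, domains, and the two toy interfaces are assumptions):

```python
from itertools import product

def assignments(domains, names):
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))

def comp(F1, F2, domains, o2_vars, o12_vars):
    """Series composition predicate of Eq. (8):
    comp(F1, F2) = F1 and F2 and forall o12 (F1 => nb(F2)),
    where nb(F2) = exists o2 F2."""
    def nb2(a):  # exists o2 F2
        return any(F2({**a, **o}) for o in assignments(domains, o2_vars))
    def robust(a):  # forall o12 (F1 => nb(F2)): preemptively block bad inputs
        return all((not F1({**a, **w})) or nb2({**a, **w})
                   for w in assignments(domains, o12_vars))
    return lambda a: F1(a) and F2(a) and robust(a)

# Usage: F1 maps input i to intermediate m, non-deterministically when i = 1;
# F2 blocks on m = 1. The composition must preemptively reject i = 1, since
# F1's adversarial output could then drive F2 into a blocking input.
domains = {'i': (0, 1), 'm': (0, 1), 'o': (0, 1)}
F1 = lambda a: (a['i'] == 0 and a['m'] == 0) or a['i'] == 1
F2 = lambda a: a['m'] == 0 and a['o'] == a['m']
C = comp(F1, F2, domains, ['o'], ['m', 'o'])
```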

The final atomic operator is input hiding, which may only be applied to sinks. If the sink is viewed as a constraint, an input variable is "hidden" by an angelic environment that chooses an input assignment to satisfy the constraint. This operator is analogous to projecting a set into a lower dimensional space.

**Definition 5 (Hiding Sink Inputs).** *Input hiding operator ihide*(w, F) *over sink interface* F(i, ∅) *and inputs* w *yields an interface with signature* (i \ w, ∅)*.*

$$
\mathit{ihide}(w, F) = \exists w F \tag{9}
$$

Unlike the composition and output hiding operators, this operator is not included in the standard theory of relational interfaces [22]; it was added to encode the controlled predecessor introduced subsequently in Eq. (10).

#### **3.2 Constructing Control Synthesis Pipelines**

The robust controlled predecessor (1) can be expressed through operator composition.

**Proposition 1.** *The controlled predecessor operator* (10) *yields a sink interface with signature* (x, ∅) *and predicate equivalent to the predicate in* (1)*.*

$$\mathsf{cpre}(F, Z) = \mathit{ihide}(u, \mathit{ohide}(x^+, \mathsf{comp}(F, Z))). \tag{10}$$

The simple proof is provided in the extended version at [1]. Proposition 1 signifies that controlled predecessors can be interpreted as an instance of robust composition of interfaces, followed by variable hiding. It can be shown that safe(F, Z, S) = comp(cpre(F, Z), S), because S(x, ∅) and cpre(F, Z) are composed in parallel.<sup>2</sup> Figure 3 visualizes the safety game's fixed point iteration from the point of view of relational interfaces. Starting from the right-most sink interface S (equivalent to Z0), the iteration (3) constructs a sequence of sink interfaces Z1, Z2, ... encoding relevant subsets of the state space. The numerous S(x, ∅) interfaces impose constraints and can be interpreted as monitors that raise errors if the safety constraint is violated.

<sup>2</sup> Disjunctions over sinks are required to encode reach(·). This will be enabled by the shared refinement operator defined in Definition 10.

**Fig. 3.** Safety control synthesis iteration (3) depicted as a sequence of sink interfaces.

#### **3.3 Modifying the Control Synthesis Pipeline**

Equation (10)'s definition of cpre(·) is oblivious to the domains of variables x, u, and x+. This generality is useful for describing a problem and serving as a blank template. Whenever problem structure exists, pipeline modifications refine the general algorithm into a form that reflects the specific problem instance. They also allow a user to inject implicit preferences into a problem, reduce computational bottlenecks, or refine a solution. The subsequent sections apply this philosophy to the abstraction-based control techniques from Sect. 1.1.


These sections do more than simply reconstruct existing techniques in the language of relational interfaces. They uncover some implicit assumptions in existing tools and either remove them or make them explicit. Minimizing the number of assumptions ensures applicability to a diverse collection of systems and specifications and compatibility with future algorithmic modifications.

#### **4 Interface Abstraction via Quantization**

A key motivator behind abstraction-based control synthesis is that computing the game iterations from Eqs. (2) and (3) exactly is often intractable for high-dimensional nonlinear dynamics. Termination is also not guaranteed. Quantizing (or "abstracting") continuous interfaces into finite counterparts ensures that each predicate operation of the game terminates in finite time, but at the cost of the solution's precision. Finer quantization incurs a smaller loss of precision but can cause the memory and computational requirements of storing and manipulating the symbolic representation to exceed machine resources.

This section first introduces the notion of interface abstraction as a refinement relation. We define the notion of a quantizer and show how it is a simple generalization of many existing quantizers in the abstraction-based control literature. Finally, we show how one can inject these quantizers anywhere in the control synthesis pipeline to reduce computational bottlenecks.

#### **4.1 Theory of Abstract Interfaces**

While a controller synthesis algorithm can analyze a simpler model of the dynamics, the results have no meaning unless they can be extrapolated back to the original system dynamics. The following interface refinement condition formalizes when this extrapolation can occur.

**Definition 6 (Interface Refinement** [22]**).** *Let* F(i, o) *and* F̂(î, ô) *be interfaces.* F̂ *is an abstraction of* F *if and only if* i ≡ î*,* o ≡ ô*, and*

$$
\mathit{nb}(\hat{F}) \Rightarrow \mathit{nb}(F) \tag{11}
$$

$$\left(\mathit{nb}(\hat{F}) \land F\right) \Rightarrow \hat{F} \tag{12}$$

*are valid formulas. This relationship is denoted by* F̂ ⪯ F*.*

Definition 6 imposes two main requirements between a concrete and abstract interface. Equation (11) encodes the condition that if F̂ accepts an input, then F must also accept it; that is, the abstract component is more aggressive in rejecting invalid inputs. Second, if both systems accept the input, then the abstract output set is a superset of the concrete output set. The abstract interface is a conservative representation of the concrete interface because the abstraction accepts fewer inputs and exhibits more non-deterministic outputs. If both interfaces are sinks, then F̂ ⪯ F reduces to F̂ ⊆ F when F and F̂ are interpreted as sets. If both are sources, then the containment direction is flipped and F̂ ⪯ F reduces to F ⊆ F̂.
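The two conditions of Definition 6 can be checked by enumeration for small finite domains (a sketch with assumed variable roles; the concrete interface below is the identity on one bit and the abstraction rejects input 1 while being fully non-deterministic on input 0):

```python
from itertools import product

def assignments(domains, names):
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))

def refines(F_abs, F, domains, i_vars, o_vars):
    """Check F_abs abstracts F (Definition 6):
    (11) nb(F_abs) => nb(F) and (12) (nb(F_abs) and F) => F_abs."""
    def nb(P, i):  # exists o P, at input assignment i
        return any(P({**i, **o}) for o in assignments(domains, o_vars))
    for i in assignments(domains, i_vars):
        if nb(F_abs, i) and not nb(F, i):
            return False  # (11) violated: abstraction accepts a rejected input
        if nb(F_abs, i):
            for o in assignments(domains, o_vars):
                if F({**i, **o}) and not F_abs({**i, **o}):
                    return False  # (12) violated: concrete behavior missing
    return True

domains = {'i': (0, 1), 'o': (0, 1)}
F = lambda a: a['o'] == a['i']   # concrete: identity
F_abs = lambda a: a['i'] == 0    # abstract: blocks i = 1, any output on i = 0
```

The abstraction relation is directional: `refines(F_abs, F, ...)` holds, while swapping the arguments fails condition (11).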

The refinement relation satisfies the reflexivity, transitivity, and antisymmetry properties required of a partial order [22] and is depicted in Fig. 4. This order has a bottom element ⊥, which is a universal abstraction. Conveniently, ⊥ signifies both Boolean false and the bottom of the partial order. This interface blocks for every potential input. In contrast, Boolean ⊤ plays no special role in the partial order. While ⊤ exhibits totally non-deterministic outputs, it also accepts all inputs. A blocking input is considered "worse" than non-deterministic outputs in the refinement order. The refinement relation encodes a direction of conservatism such that any reasoning done over the abstract models is sound and generalizes to the concrete model.

**Theorem 1 (Informal Substitutability Result** [22]**).** *For any input that is allowed for the abstract model, the output behaviors exhibited by the abstract model contain the output behaviors exhibited by the concrete model.*

**Fig. 4.** Example depiction of the refinement partial order. Each small plot depicts input-output pairs that satisfy an interface's predicate. Inputs (outputs) vary along the horizontal (vertical) axis. Because *B* blocks on some inputs but *A* accepts all inputs, *B* ⪯ *A*. Interface *C* exhibits more output non-determinism than *A*, so *C* ⪯ *A*. Similarly *D* ⪯ *B*, *D* ⪯ *C*, ⊤ ⪯ *C*, etc. Note that *B* and *C* are incomparable because *C* exhibits more output non-determinism and *B* blocks for more inputs. The false interface ⊥ is a universal abstraction, while ⊤ is incomparable with *B* and *D*.

If a property on outputs has been established for an abstract interface, then it still holds if the abstract interface is replaced with the concrete one. Informally, the abstract interface is more conservative so if a property holds with the abstraction then it must also hold for the true system. All aforementioned interface operators preserve the properties of the refinement relation of Definition 6, in the sense that they are monotone with respect to the refinement partial order.

**Theorem 2 (Composition Preserves Refinement** [22]**).** *Let* Â ⪯ A *and* B̂ ⪯ B*. If the composition is well defined, then comp*(Â, B̂) ⪯ *comp*(A, B)*.*

**Theorem 3 (Output Hiding Preserves Refinement** [22]**).** *If* A ⪯ B*, then ohide*(w, A) ⪯ *ohide*(w, B) *for any variable* w*.*

**Theorem 4 (Input Hiding Preserves Refinement).** *If* A, B *are both sink interfaces and* A ⪯ B*, then ihide*(w, A) ⪯ *ihide*(w, B) *for any variable* w*.*

Proofs for Theorems 2 and 3 are provided in [22]. Theorem 4's proof is simple and is omitted. One can think of using interface composition and variable hiding to horizontally (with respect to the refinement order) navigate the space of all interfaces. The synthesis pipeline encodes one navigated path, and monotonicity of these operators yields guarantees about the path's end point. Composite operators such as cpre(·) chain together multiple incremental steps. Furthermore, since the composition of monotone operators is itself a monotone operator, any composite constructed from these parts is also monotone. In contrast, the coarsening and refinement operators introduced later in Definitions 8 and 10 respectively are used to move vertically and construct abstractions. The "direction" of new composite operators can easily be established through simple reasoning about the cumulative directions of their constituent operators.

**Fig. 5.** Coarsening of the *<sup>F</sup>*<sup>x</sup> interface to 2<sup>3</sup>*,* <sup>2</sup><sup>4</sup> and 2<sup>5</sup> bins along each dimension for a fixed *v* assignment. Interfaces are coarsened within milliseconds for BDDs but the runtime depends on the finite abstraction's data structure representation.

#### **4.2 Dynamically Coarsening Interfaces**

In practice, the sequence of interfaces Z<sup>i</sup> generated during synthesis grows in complexity. This occurs even if the dynamics F and the target/safe sets have compact representations (i.e., fewer nodes if using BDDs). Coarsening F and Z<sup>i</sup> combats this growth in complexity by effectively reducing the amount of information sent between iterations of the fixed point procedure.

Spatial discretization or *coarsening* is achieved by use of a quantizer interface that implicitly aggregates points in a space into a partition or cover.

**Definition 7.** *A quantizer* Q(i, o) *is any interface that abstracts the identity interface* (i == o) *associated with the signature* (i, o)*.*

Quantizers decrease the complexity of the system representation and make synthesis more computationally tractable. A coarsening operator abstracts an interface by connecting it in series with a quantizer. Coarsening reduces the number of non-blocking inputs and increases the output non-determinism.
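As a concrete instance of Definition 7, the sketch below implements an interval quantizer over a bounded real line (the bounds, cell width, and variable names are illustrative assumptions):

```python
import math

def cell_quantizer(lo, hi, width):
    """A quantizer Q(i, o) in the sense of Definition 7: it blocks inputs
    outside the cover [lo, hi) and relates each accepted input
    non-deterministically to every point of the width-sized cell containing
    it. Since that cell always contains the input itself, Q abstracts the
    identity interface (i == o)."""
    def cell(v):
        return math.floor((v - lo) / width)
    def Q(a):
        i, o = a['i'], a['o']
        return lo <= i < hi and lo <= o < hi and cell(i) == cell(o)
    return Q

# Usage: 8 cells of width 0.5 covering [-2, 2), matching the Dubins
# position domain; inputs outside the cover block.
Q = cell_quantizer(-2.0, 2.0, 0.5)
```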

**Definition 8 (Input/Output Coarsening).** *Given an interface* F(i, o) *and input quantizer* Q(î, i)*, input coarsening yields an interface with signature* (î, o)*.*

$$\mathit{icoarsen}(F, Q(\hat{i}, i)) = \mathit{ohide}(i, \mathsf{comp}(Q(\hat{i}, i), F)) \tag{13}$$

*Similarly, given an output quantizer* Q(o, ô)*, output coarsening yields an interface with signature* (i, ô)*.*

$$\mathit{ocoarsen}(F, Q(o, \hat{o})) = \mathit{ohide}(o, \mathsf{comp}(F, Q(o, \hat{o}))) \tag{14}$$

Figure 5 depicts how coarsening reduces the information required to encode a finite interface. It leverages a variable precision quantizer, whose implementation is described in the extended version at [1].

The corollary below shows that quantizers can be seamlessly integrated into the synthesis pipeline while preserving the refinement order. It readily follows from Theorems 2, 3, and the quantizer definition.

**Corollary 1.** *Input and output coarsening operations* (13) *and* (14) *are monotone with respect to the interface refinement order* ⪯*.*

**Fig. 6.** Number of BDD nodes (red) and number of states in reach basin (blue) with respect to the reach game iteration with a greedy quantization. The solid lines result from the unmodified game with no coarsening heuristic. The dashed lines result from greedy coarsening whenever the winning region exceeds 3000 BDD nodes. (Color figure online)

It is difficult to know a priori where a specific problem instance lies along the spectrum between mathematical precision and computational efficiency. It is therefore desirable to coarsen dynamically in response to runtime conditions rather than statically beforehand. Coarsening heuristics for reach games include, for example, greedily coarsening the winning region whenever its representation exceeds a size threshold, as in the greedy heuristic of Fig. 6.


The most common quantizer in the literature never blocks and only increases non-determinism (such quantizers are called "strict" in [18,19]). If a quantizer is interpreted as a partition or cover, this requirement means that the union of cells must equal the entire space. Definition 7 relaxes that requirement so the union can instead be a subset. It also hints at other variants, such as quantizers that do not increase output non-determinism but instead block for more inputs.

#### **5 Refining System Dynamics**

Shared refinement [22] is an operation that takes two interfaces and merges them into a single interface. In contrast to coarsening, it makes interfaces more precise. Many tools construct system abstractions by starting from the universal abstraction ⊥, then iteratively refining it with a collection of smaller interfaces that represent input-output samples. This approach is especially useful if the canonical concrete system is a black box function, Simulink model, or source code file, since such representations neither readily support the predicate operations nor can be coarsened directly. We will describe later how other tools implement a restrictive form of refinement that introduces unnecessary dependencies.

Interfaces can be successfully merged whenever they do not contain contradictory information. The shared refinability condition below formalizes when such a contradiction does not exist.

**Definition 9 (Shared Refinability** [22]**).** *Interfaces* F1(i, o) *and* F2(i, o) *with identical signatures are shared refinable if*

$$(\mathit{nb}(F\_1) \land \mathit{nb}(F\_2)) \Rightarrow \exists o(F\_1 \land F\_2) \tag{15}$$

For any input that does not block in either interface, the corresponding output sets must have a non-empty intersection. If multiple interfaces are shared refinable, they can be combined into a single one that encapsulates all of their information.

**Definition 10 (Shared Refinement Operation** [22]**).** *The shared refinement operation combines two shared refinable interfaces* F<sup>1</sup> *and* F2*, yielding a new identical signature interface corresponding to the predicate*

$$\mathit{refine}(F\_1, F\_2) = (\mathit{nb}(F\_1) \lor \mathit{nb}(F\_2)) \land (\mathit{nb}(F\_1) \Rightarrow F\_1) \land (\mathit{nb}(F\_2) \Rightarrow F\_2). \tag{16}$$

The left term expands the set of accepted inputs. The right term signifies that if an input was accepted by multiple interfaces, the output must be consistent with each of them. The shared refinement operation reduces to disjunction for sink interfaces and to conjunction for source interfaces.

Shared refinement's effect is to move up the refinement order by combining interfaces. Given a collection of shared refinable interfaces, the shared refinement operation yields the least upper bound with respect to the refinement partial order in Definition 6. Violation of (15) can be detected if the interfaces fed into refine(·) are not abstractions of the resulting interface.
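Equation (16) can be sketched directly over predicates-as-callables, with nb(·) evaluated by enumerating outputs (the two sampled interfaces and their overlapping input regions are assumed for illustration):

```python
from itertools import product

def assignments(domains, names):
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))

def refine(F1, F2, domains, o_vars):
    """Shared refinement of Eq. (16):
    (nb(F1) or nb(F2)) and (nb(F1) => F1) and (nb(F2) => F2),
    where nb(F) = exists o F."""
    def nb(F, a):
        return any(F({**a, **o}) for o in assignments(domains, o_vars))
    def R(a):
        n1, n2 = nb(F1, a), nb(F2, a)
        return (n1 or n2) and ((not n1) or F1(a)) and ((not n2) or F2(a))
    return R

# Usage: two sampled interfaces covering overlapping input regions; on the
# overlap (i = 1) their outputs agree, so condition (15) holds and the merge
# accepts the union of their inputs while staying consistent with both.
domains = {'i': (0, 1, 2), 'o': (0, 1)}
F1 = lambda a: a['i'] in (0, 1) and a['o'] == 0
F2 = lambda a: a['i'] in (1, 2) and a['o'] == 0
R = refine(F1, F2, domains, ['o'])
```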

#### **5.1 Constructing Finite Interfaces Through Shared Refinement**

A common method to construct finite abstractions is through simulation and overapproximation of forward reachable sets. This technique appears in tools such as PESSOA [12], SCOTS [19], MASCOT [7], ROCS [9] and ARCS [4]. By covering a sufficiently large portion of the interface input space, one can construct larger composite interfaces from smaller ones via shared refinement.

**Fig. 7.** (Left) Result of sample and coarsen operations for the control system interface F(x ∪ u, x+). The I and Î interfaces encode the same predicate but play different roles as sink and source. (Right) Visualization of finite abstraction as traversing the refinement partial order. Nodes represent interfaces and edges signify data dependencies for interface manipulation operators. Multiple refine edges point to a single node because refinement combines multiple interfaces. Input-output (IO) sample and coarsening are unary operations, so the resulting nodes only have one incoming edge. The concrete interface F refines all others, and the final result is an abstraction F̂.

Smaller interfaces are constructed by sampling regions of the input space and constructing an input-output pair. In Fig. 7's left half, a sink interface I(x ∪ u, ∅) acts as a filter. The source interface Î(∅, x ∪ u) composed with F(x ∪ u, x+) prunes any information outside the relevant input region. The original interface refines any sampled interface. To make samples *finite*, interface inputs and outputs are coarsened. An individual sampled abstraction is not useful for synthesis on its own because it is restricted to a local portion of the interface input space. After sampling, many finite interfaces are merged through shared refinement. The assumption Îi ⇒ nb(F) encodes that the dynamics will not raise an error when simulated and is often made implicitly. Figure 7's right half depicts the sample, coarsen, and refine operations as methods to vertically traverse the interface refinement order.

Critically, refine(·) can be called within the synthesis pipeline and does not assume that the sampled interfaces are disjoint. Figure 8 shows the results from refining the dynamics with a collection of state-control hyper-rectangles that are randomly generated by uniformly sampling their widths and offsets along each dimension. These hyper-rectangles may overlap. If the same collection of hyper-rectangles were used in MASCOT, SCOTS, ARCS, or ROCS, this would yield a much more conservative abstraction of the dynamics because their implementations are not robust to overlapping or misaligned samples. PESSOA and SCOTS circumvent this issue altogether by enforcing disjointness with an exhaustive traversal of the state-control space, at the cost of unnecessarily coupling the refinement and sampling procedures. The lunar lander in the extended version [1] embraces overlapping and uses two misaligned grids to construct a grid partition with p<sup>N</sup> elements from only p<sup>N</sup>(1/2)<sup>N−1</sup> samples (where p is the number of bins along each dimension and N is the interface input dimension). This technique introduces a small degree of conservatism, but its computational savings typically outweigh this cost.

**Fig. 8.** The number of states in the computed reach basin grows with the number of random samples. The vertical axis is lower bounded by the number of states in the target (131k) and upper bounded by 631k, the number of states obtained with an exhaustive traversal. A naive implementation of the exhaustive traversal would require 12 million samples. The right shows basins for 3000 samples (top) and 6000 samples (bottom).

#### **6 Decomposed Control Predecessor**

A decomposed control predecessor is available whenever the system state space is a Cartesian product and the dynamics decompose componentwise, such as Fx, Fy, and Fθ for the Dubins vehicle. This property is common for continuous control systems over Euclidean spaces. While one may construct F directly via the abstraction sampling approach, this is often intractable for higher-dimensional systems. A more sophisticated approach abstracts the lower-dimensional components Fx, Fy, and Fθ individually, computes F = comp(Fx, Fy, Fθ), then feeds it to the monolithic cpre(·) from Proposition 1. This section's approach is to avoid computing F at all by decomposing the monolithic cpre(·). It operates by breaking apart the term *ohide*(x+, comp(F, Z)) in a way that respects the decomposition structure. For the Dubins vehicle example, *ohide*(x+, comp(F, Z)) is replaced with

$$\mathit{ohide}(p\_x^+, \mathsf{comp}(F\_x, \mathit{ohide}(p\_y^+, \mathsf{comp}(F\_y, \mathit{ohide}(\theta^+, \mathsf{comp}(F\_\theta, Z))))))$$

yielding a sink interface with inputs px, py, v, θ, and ω. This representation and the original *ohide*(x+, comp(F, Z)) are equivalent because comp(·) is associative and the interfaces do not share outputs x+ ≡ {px+, py+, θ+}. Figure 9 shows multiple variants of cpre(·) and the improved runtimes obtained when one avoids preemptively constructing the monolithic interface. The decomposed cpre(·) resembles techniques that exploit partitioned transition relations in symbolic model checking [5].
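The equivalence underlying this decomposition, eliminating each next-state variable as soon as its component has been composed, can be checked on a tiny boolean sketch (the two components, their dynamics, and the variable names are assumptions for illustration):

```python
from itertools import product

def exists(P, qvar):
    """Existential quantification of a boolean variable in a predicate
    represented as a callable over assignment dicts."""
    return lambda a: P({**a, qvar: 0}) or P({**a, qvar: 1})

# Two decoupled boolean components: the x component holds its value,
# the y component toggles it.
Fx = lambda a: a['xp'] == a['x']
Fy = lambda a: a['yp'] == 1 - a['y']
Z = lambda a: a['xp'] == 1 and a['yp'] == 1  # target at the next step

# Monolithic: hide both next-state variables from Fx AND Fy AND Z at once.
mono = exists(exists(lambda a: Fx(a) and Fy(a) and Z(a), 'xp'), 'yp')
# Decomposed: hide yp right after composing Fy with Z, before Fx enters.
inner = exists(lambda a: Fy(a) and Z(a), 'yp')
deco = exists(lambda a: Fx(a) and inner(a), 'xp')

# Both orderings define the same predicate over the current state.
results = {(x, y): (mono({'x': x, 'y': y}), deco({'x': x, 'y': y}))
           for x, y in product((0, 1), repeat=2)}
```

With BDDs, the decomposed ordering keeps intermediate predicates over fewer variables, which is the source of the runtime gains reported in Fig. 9.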

No tools from Sect. 1.1 natively support decomposed control predecessors. We have shown a decomposed abstraction for components composed in parallel, but this can also be generalized to series composition to capture, for example, a system where multiple components have different temporal sampling periods.

**Fig. 9.** A monolithic cpre(·) incurs unnecessary pre-processing and synthesis runtime costs for the Dubins vehicle reach game. Each variant of cpre(·) above composes the interfaces Fx, Fy, and Fθ in different permutations. For example, Fxy represents comp(Fx, Fy) and F represents comp(Fx, Fy, Fθ).

#### **7 Conclusion**

Tackling difficult control synthesis problems will require exploiting *all* available structure in a system with tools that can *flexibly adapt* to an individual problem's idiosyncrasies. This paper lays a foundation for developing an extensible suite of interoperable techniques and demonstrates the potential computational gains in an application to controller synthesis with finite abstractions. Adhering to a simple yet powerful set of well-understood primitives also constitutes a disciplined methodology for algorithm development, which is especially necessary if one wants to develop concurrent or distributed algorithms for synthesis.

#### **References**



### **Temporal Stream Logic: Synthesis Beyond the Bools**

Bernd Finkbeiner<sup>1</sup>, Felix Klein<sup>1(B)</sup>, Ruzica Piskac<sup>2</sup>, and Mark Santolucito<sup>2</sup>

<sup>1</sup> Saarland University, Saarbrücken, Germany klein@react.uni-saarland.de <sup>2</sup> Yale University, New Haven, USA

**Abstract.** Reactive systems that operate in environments with complex data, such as mobile apps or embedded controllers with many sensors, are difficult to synthesize. Synthesis tools usually fail for such systems because the state space resulting from the discretization of the data is too large. We introduce TSL, a new temporal logic that separates control and data. We provide a CEGAR-based synthesis approach for the construction of implementations that are guaranteed to satisfy a TSL specification for all possible instantiations of the data processing functions. TSL provides an attractive trade-off for synthesis. On the one hand, synthesis from TSL, unlike synthesis from standard temporal logics, is undecidable in general. On the other hand, however, synthesis from TSL is scalable, because it is independent of the complexity of the handled data. Among other benchmarks, we have successfully synthesized a music player Android app and a controller for an autonomous vehicle in the Open Race Car Simulator (TORCS).

#### **1 Introduction**

In reactive synthesis, we automatically translate a formal specification, typically given in a temporal logic, into a controller that is guaranteed to satisfy the specification. Over the past two decades, there has been much progress on reactive synthesis, both in terms of algorithms, notably with techniques like GR(1) synthesis [7] and bounded synthesis [20], and in terms of tools, as showcased, for example, in the annual SYNTCOMP competition [25].

In practice, however, reactive synthesis has seen limited success. One of the largest published success stories [6] is the synthesis of the AMBA bus protocol. To push synthesis even further, automatically synthesizing a controller for

Supported by the European Research Council (ERC) Grant OSARES (No. 683300), the German Research Foundation (DFG) as part of the Collaborative Research Center Foundations of Perspicuous Software Systems (TRR 248, 389792660), and the National Science Foundation (NSF) Grant CCF-1302327.

an autonomous system has been recognized to be of critical importance [52]. Despite many years of experience with synthesis tools, our own attempts to synthesize such controllers with existing tools have been unsuccessful. The reason is that the tools are unable to handle the data complexity of the controllers. The controller only needs to switch between a small number of behaviors, such as steering through a bend or shifting gears at high rpm. The number of control states in a typical controller (cf. [18]) is thus not much different from the arbiter in the AMBA case study. However, in order to correctly initiate transitions between control states, the driving controller must continuously process data from more than 20 sensors.

If this data is included (even as a rough discretization) in the state space of the controller, then the synthesis problem is much too large to be handled by any available tools. It seems clear, then, that a scalable synthesis approach must separate control and data. If we assume that the data processing is handled by some other approach (such as deductive synthesis [38] or manual programming), is it then possible to solve the remaining reactive synthesis problem?

In this paper, we show that scalable reactive synthesis is indeed possible. Separating data and control has allowed us to synthesize reactive systems, including an autonomous driving controller and a music player app, that had been impossible to synthesize with previously available tools. However, the separation of data and control implies some fundamental changes to reactive synthesis, which we describe in the rest of the paper. These changes also imply that the reactive synthesis problem is no longer, in general, decidable. We thus trade theoretical decidability for practical scalability, which is, at least with regard to the goal of synthesizing realistic systems, an attractive trade-off.

We introduce Temporal Stream Logic (TSL), a new temporal logic that includes *updates*, such as ⟦y ← f x⟧, and predicates over arbitrary function terms. The update ⟦y ← f x⟧ indicates that the result of applying the function f to the variable x is assigned to y. The implementation of predicates and functions is not part of the synthesis problem. Instead, we look for a system that satisfies the TSL specification *for all possible interpretations of the functions and predicates*.

This implicit quantification over all possible interpretations provides a useful abstraction: it allows us to *independently* implement the data processing part. On the other hand, this quantification is also the reason for the undecidability of the synthesis problem. If a predicate is applied to the same term *twice*, it must (independently of the interpretation) return the *same* truth value. The synthesis must then implicitly maintain a (potentially infinite) set of terms to which the predicate has previously been applied. As we show later, this set of terms can be used to encode PCP [45] for a proof of undecidability.

We present a practical synthesis approach for TSL specifications, which is based on bounded synthesis [20] and counterexample-guided abstraction refinement (CEGAR) [9]. We use bounded synthesis to search for an implementation up to an (iteratively growing) bound on the number of states. This approach underapproximates the actual TSL synthesis problem by leaving the interpretation of the predicates to the environment. The underapproximation allows

**Fig. 1.** The TSL synthesis procedure uses a modular design. Each step takes input from the previous step as well as interchangeable modules (dashed boxes).

for inconsistent behaviors: the environment might assign different truth values to the same predicate when evaluated at different points in time, even if the predicate is applied to the same term. However, if we find an implementation in this underapproximation, then the CEGAR loop terminates and we have a correct implementation for the original TSL specification. If we do not find an implementation in the underapproximation, we compute a counter-strategy for the environment. Because bounded synthesis reduces the synthesis problem to a safety game, the counter-strategy is a reachability strategy that can be represented as a finite tree. We check whether the counter-strategy is spurious by searching for a pair of positions in the strategy where some predicate results in different truth values when applied to the same term. If the counter-strategy is not spurious, then no implementation exists for the considered bound, and we increase the bound. If the counter-strategy is spurious, then we introduce a constraint into the specification that eliminates the incorrect interpretation of the predicate, and continue with the refined specification.
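The loop described above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the callables `synthesize`, `counter_strategy`, `spurious`, and `refine` are hypothetical stand-ins for the bounded LTL synthesizer, the counter-strategy extraction, the spuriousness check, and the specification strengthening.

```python
# CEGAR loop sketch: bounded search for an implementation, with refinement of
# the specification whenever a spurious counter-strategy is found.
# All four helper callables are hypothetical stand-ins (see lead-in).

def cegar_synthesize(spec, synthesize, counter_strategy, spurious, refine,
                     max_bound=10):
    """Iteratively grow the state bound; refine the spec on spurious
    counter-strategies; give up (undecidability!) at max_bound."""
    bound = 1
    while bound <= max_bound:
        impl = synthesize(spec, bound)
        if impl is not None:
            # Found in the underapproximation => correct for the original spec.
            return ("realizable", impl)
        cs = counter_strategy(spec, bound)
        witness = spurious(cs)
        if witness is None:
            bound += 1                    # non-spurious: no impl at this bound
        else:
            spec = refine(spec, witness)  # rule out inconsistent evaluation
    return ("unknown", None)
```

The loop only terminates with "realizable" or with a non-spurious counter-strategy for every bound tried; since TSL realizability is undecidable, a cutoff such as `max_bound` is unavoidable in practice.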

A general overview of this procedure is shown in Fig. 1. The top half of the figure depicts the bounded search for an implementation that realizes a TSL specification using the CEGAR loop to refine the specification. If the specification is realizable, we proceed in the bottom half of the process, where a synthesized implementation is converted to a control flow model (CFM) determining the control of the system. We then specialize the CFM to Functional Reactive Programming (FRP), which is a popular and expressive programming paradigm for building reactive programs using functional programming languages [14].

**Fig. 2.** Sample code and specification for the music player app.

Our framework supports any FRP library using the *Arrow* or *Applicative* design patterns, which covers most of the existing FRP libraries (e.g. [2,3,10,41]). Finally, the synthesized control flow is embedded into a project context, where it is equipped with function and predicate implementations and then compiled to an executable program.

Our experience with synthesizing systems based on TSL specifications has been extremely positive. The synthesis works for a broad range of benchmarks, ranging from classic reactive synthesis problems (like escalator control), through programming exercises from functional reactive programming, to novel case studies like our music player app and the autonomous driving controller for a vehicle in the Open Race Car Simulator (TORCS).

#### **2 Motivating Example**

To demonstrate the utility of our method, we synthesized a music player Android app<sup>1</sup> from a TSL specification. A major challenge in developing Android apps is managing the temporal behavior of an app across the *Android lifecycle* [46]. The Android lifecycle describes how an app should handle being paused, being moved to the background, coming back into focus, or being terminated. In particular, *resume and restart errors* are commonplace and difficult to detect and correct [46]. Our music player app demonstrates a situation in which a resume and restart error could be unwittingly introduced when programming by hand, but is avoided by providing a specification. We only highlight the key parts of the example here to give an intuition of TSL. The complete specification is presented in [19].

Our music player app utilizes the Android music player library (MP), as well as its control interface (Ctrl). It pauses any playing music when moved to the background (for instance if a call is received), and continues playing the currently selected track (Tr) at the last track position when the app is resumed. In the Android system (Sys), the leaveApp method is called whenever the app moves to the background, while the resumeApp method is called when the app is brought back to the foreground. To avoid confusion between pausing music and pausing the app, we use leaveApp and resumeApp in place of the Android methods

<sup>1</sup> https://play.google.com/store/apps/details?id=com.mark.myapplication.

**Fig. 3.** The effect of a minor change in functionality on code versus a specification.

onPause and onResume. A programmer might manually write code for this as shown on the left in Fig. 2.

This behavior can be described directly in TSL, as shown on the right in Fig. 2. Even without a formal introduction of the notation, the specification closely matches the textual description. First, when the user leaves the app and the music is playing, the music pauses. Likewise, for the second part, when the user resumes the app, the music starts playing again.

However, assume we want to change the behavior so that the music only plays on resume when the music had been playing before leaving the app in the first place. In the manually written program, this new functionality requires an additional variable wasPlaying to keep track of the music state. Managing this state requires multiple changes in the code, as shown on the left in Fig. 3. The required code changes include: a conditional in the resumeApp method, setting wasPlaying appropriately in two places in leaveApp, and providing an initial value. Although this is a small example, it demonstrates how a minor change in functionality may require wide-reaching code changes. In addition, this change introduces a globally scoped variable, which might then accidentally be set or read elsewhere. In contrast, it is a simple matter to change the TSL specification to reflect the new functionality. Here, we only update one part of the specification to say that if the user leaves the app while the music is playing, the music has to play again as soon as the app resumes.
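Since Fig. 3 is rendered only as an image here, the shape of the manual change can be sketched as follows. This is a hypothetical stand-in for the Android code (method names leave_app/resume_app follow the leaveApp/resumeApp methods from the text; the player API itself is elided):

```python
# Sketch of the manually maintained state described above. The extra
# was_playing flag is exactly the globally visible state variable the
# specification-based approach avoids.

class MusicPlayer:
    def __init__(self):
        self.playing = False
        self.was_playing = False      # new state variable, needs an initial value

    def leave_app(self):
        # was_playing must be set appropriately before pausing
        self.was_playing = self.playing
        if self.playing:
            self.playing = False      # pause any playing music

    def resume_app(self):
        # new conditional: only resume if music was playing before leaving
        if self.was_playing:
            self.playing = True
```

Every method touching `playing` now has to stay consistent with `was_playing`; in TSL this bookkeeping disappears into the specification.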

Synthesis allows us to specify a temporal behavior without worrying about the implementation details. In this example, writing the specification in TSL has eliminated the need for an additional state variable, similar to how a higher-order map eliminates the need for an iteration variable. In more complex examples the benefits compound, as TSL provides a modular interface to specify behaviors, offloading the management of multiple interconnected temporal behaviors from the user to the synthesis engine.

#### **3 Preliminaries**

We assume time to be discrete and denote it by the set N of non-negative integers. A value is an arbitrary object of arbitrary type; V denotes the set of all values. The Boolean values are denoted by B ⊆ V. A stream s: N → V is a function fixing a value at each point in time. An n-ary function f: V<sup>n</sup> → V determines new values from n given values, where the set of all functions (of arbitrary arity) is given by F. Constants are functions of arity 0. Every constant is a value, i.e., an element of F ∩ V. An n-ary predicate p: V<sup>n</sup> → B checks a property over n values. The set of all predicates (of arbitrary arity) is given by P, where P ⊆ F. We use B<sup>[A]</sup> to denote the set of all total functions with domain A and image B.

In the classical synthesis setting, inputs and outputs are vectors of Booleans: the standard abstraction treats inputs and outputs as atomic propositions I ∪ O, while their Boolean combinations form an alphabet Σ = 2<sup>I∪O</sup>. Behavior is then described through infinite sequences α = α(0)α(1)α(2)... ∈ Σ<sup>ω</sup>. A *specification* describes a relation between input sequences α ∈ (2<sup>I</sup>)<sup>ω</sup> and output sequences β ∈ (2<sup>O</sup>)<sup>ω</sup>. Usually, this relation is not given by explicit sequences, but by a formula in a temporal logic. The most popular such logic is Linear Temporal Logic (LTL) [43], which uses Boolean connectives to specify behavior at specific points in time and temporal connectives to relate sub-specifications over time. The realizability and synthesis problems for LTL are 2ExpTime-complete [44].

An implementation describes a realizing strategy, formalized via infinite trees. A Φ-labeled and Υ-branching tree is a function σ: Υ<sup>∗</sup> → Φ, where Υ denotes the set of branching directions along a tree. Every node of the tree is given by a finite prefix v ∈ Υ<sup>∗</sup>, which fixes the path to reach the node from the root. Every node is labeled by an element of Φ. For infinite paths ν ∈ Υ<sup>ω</sup>, the branch σ ≀ ν denotes the sequence of labels that appear along ν, i.e., ∀t ∈ N. (σ ≀ ν)(t) = σ(ν(0) ... ν(t − 1)).
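The branch definition can be illustrated with a small sketch (our own toy example, not from the paper): a tree is any map from finite direction prefixes to labels, and a branch collects the labels seen while walking along a path.

```python
# A Φ-labeled, Υ-branching tree as a function from finite prefixes to labels,
# and the branch σ≀ν along a path ν (truncated to its first k labels).

def branch(sigma, nu, k):
    """(σ≀ν)(t) = σ(ν(0)...ν(t-1)); return the first k labels along nu."""
    return [sigma(tuple(nu[:t])) for t in range(k)]

# Toy tree over directions Υ = {0, 1}: the label of a node is the parity of
# the sum of the directions on the path to it (an arbitrary choice).
sigma = lambda prefix: sum(prefix) % 2
```

For example, `branch(sigma, [1, 0, 1, 1], 4)` evaluates σ on the prefixes (), (1), (1,0), (1,0,1) in order.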

#### **4 Temporal Stream Logic**

We present a new logic: Temporal Stream Logic (TSL), which is especially designed for synthesis and allows for the manipulation of infinite streams of arbitrary (even non-enumerative, or higher order) type. It provides a straightforward notation to specify how outputs are computed from inputs, while using an intuitive interface to access time. The main focus of TSL is to describe temporal control flow, while abstracting away concrete implementation details. This not only keeps the logic intuitive and simple, but also allows a user to identify problems in the control flow even without a concrete implementation at hand. In this way, the use of TSL scales up to any required abstraction, such as API calls or complex algorithmic transformations.

*Architecture.* A TSL formula ϕ specifies a reactive system that in every time step processes a finite number of inputs I and produces a finite number of outputs O. Furthermore, it uses cells C to store a value computed at time t, which can then be reused in the next time step t + 1. An overview of the architecture of such a system is given in Fig. 4a. In terms of behavior, the environment produces infinite

**Fig. 4.** General architecture of reactive systems that are specified in TSL on the left, and the structure of function, predicate and updates on the right.

streams of input data, while the system uses pure (side-effect free) functions to transform the values of these input streams in every time step. After their transformation, the data values are either passed to an output stream or are passed to a cell, which pipes the output value from one time step back to the corresponding input value of the next. The behaviour of the system is captured by its infinite execution over time.

*Function Terms, Predicate Terms, and Updates.* In TSL we differentiate between two elements: we use purely functional transformations, reflected by functions f ∈ F and their compositions, and predicates p ∈ P, used to control how data flows inside the system. To reason about both elements we use a term-based notation, where we distinguish between function terms τ<sub>F</sub> and predicate terms τ<sub>P</sub>, respectively. Function terms are either constructed from inputs or cells (s<sub>i</sub> ∈ I ∪ C), or from functions, recursively applied to a set of function terms. Predicate terms are constructed similarly, by applying a predicate to a set of function terms. Finally, an update takes the result of a function computation and passes it either to an output or a cell (s<sub>o</sub> ∈ O ∪ C). An overview of the syntax of the different term notations is given in Fig. 4b. Note that we use curried argument notation, similar to functional programming languages.

We denote sets of function terms, predicate terms, and updates by T<sub>F</sub>, T<sub>P</sub>, and T<sub>U</sub>, respectively, where T<sub>P</sub> ⊆ T<sub>F</sub>. We use F to denote the set of function literals and P ⊆ F to denote the set of predicate literals, where the literals s<sub>i</sub>, s<sub>o</sub>, f, and p are symbolic representations of inputs and cells, outputs and cells, functions, and predicates, respectively. Literals are used to construct terms as shown in Fig. 4b. Since we use a symbolic representation, functions and predicates are not tied to a specific implementation. However, we still classify them according to their arity, i.e., the number of function terms they are applied to, as well as by their type: input, output, cell, function or predicate. Furthermore, terms can be compared syntactically using the equivalence relation ≡. To assign a semantic interpretation to function literals, we use an assignment function ⟨·⟩: F → F.
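A minimal sketch of such a symbolic term representation (our own illustration, not the authors' implementation): with immutable, structurally compared records, the syntactic equivalence relation ≡ becomes plain equality.

```python
# Symbolic TSL terms: signals (inputs/cells) and curried applications.
# Predicate terms are App nodes whose head happens to be a predicate literal.

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Signal:
    """An input or cell literal s_i."""
    name: str

@dataclass(frozen=True)
class App:
    """An application f τ0 ... τ(m-1); func is a function/predicate literal."""
    func: str
    args: Tuple          # tuple of Signal / App

# Syntactic equivalence ≡ is structural equality of the frozen dataclasses:
t1 = App("f", (Signal("x"),))
t2 = App("f", (Signal("x"),))
```

Because the dataclasses are frozen, terms are also hashable and can be collected in the sets T<sub>F</sub> and T<sub>P</sub>.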

*Inputs, Outputs, and Computations.* We consider momentary inputs i ∈ V<sup>[I]</sup>, which are assignments of inputs i ∈ I to values v ∈ V. For the sake of readability, let I = V<sup>[I]</sup>. Input streams are infinite sequences ι ∈ I<sup>ω</sup> consisting of infinitely many momentary inputs.

Similarly, a momentary output o ∈ V<sup>[O]</sup> is an assignment of outputs o ∈ O to values v ∈ V, where we also use O = V<sup>[O]</sup>. Output streams are infinite sequences ϱ ∈ O<sup>ω</sup>. To capture the behavior of a cell, we introduce the notion of a computation ς. A computation fixes the function terms that are used to compute outputs and cell updates, without fixing the semantics of function literals. Intuitively, a computation only determines which function terms are used to compute an output, but abstracts from actually computing it.

The basic element of a computation is a computation step c ∈ T<sub>F</sub><sup>[O∪C]</sup>, which is an assignment of outputs and cells s<sub>o</sub> ∈ O ∪ C to function terms τ<sub>F</sub> ∈ T<sub>F</sub>. For the sake of readability, let C = T<sub>F</sub><sup>[O∪C]</sup>. A computation step fixes the control flow behavior at a single point in time. A computation ς ∈ C<sup>ω</sup> is an infinite sequence of computation steps.

As soon as input streams, and function and predicate implementations are known, computations can be turned into output streams. To this end, let ⟨·⟩: F → F be some function assignment. Furthermore, assume that there are predefined constants init<sub>c</sub> ∈ F ∩ V for every cell c ∈ C, which provide an initial value for each stream at the initial point in time. To obtain an output stream from a computation ς ∈ C<sup>ω</sup> under the input stream ι, we use an evaluation function η<sub>⟨·⟩</sub>: C<sup>ω</sup> × I<sup>ω</sup> × N × T<sub>F</sub> → V:

$$\eta_{\langle\cdot\rangle}(\varsigma,\iota,t,\mathtt{s_i}) = \begin{cases} \iota(t)(\mathtt{s_i}) & \text{if } \mathtt{s_i} \in \mathbf{I} \\ \mathit{init}_{\mathtt{s_i}} & \text{if } \mathtt{s_i} \in \mathbf{C} \wedge t = 0 \\ \eta_{\langle\cdot\rangle}(\varsigma,\iota,t-1,\varsigma(t-1)(\mathtt{s_i})) & \text{if } \mathtt{s_i} \in \mathbf{C} \wedge t > 0 \end{cases}$$

$$\eta_{\langle\cdot\rangle}(\varsigma,\iota,t,\mathtt{f}\ \tau_0\ \cdots\ \tau_{m-1}) = \langle\mathtt{f}\rangle\ \eta_{\langle\cdot\rangle}(\varsigma,\iota,t,\tau_0)\ \cdots\ \eta_{\langle\cdot\rangle}(\varsigma,\iota,t,\tau_{m-1})$$

Then the induced output stream ϱ<sub>⟨·⟩,ς,ι</sub> ∈ O<sup>ω</sup> is defined via ϱ<sub>⟨·⟩,ς,ι</sub>(t)(o) = η<sub>⟨·⟩</sub>(ς, ι, t, ς(t)(o)) for all t ∈ N, o ∈ O.

*Syntax.* Every TSL formula ϕ is built according to the following grammar:

$$\varphi \;:=\; \tau \in \mathcal{T}_P \cup \mathcal{T}_U \;\mid\; \neg\varphi \;\mid\; \varphi \wedge \varphi \;\mid\; \bigcirc\varphi \;\mid\; \varphi\,\mathcal{U}\,\varphi$$

An atomic proposition τ consists either of a predicate term, serving as a Boolean interface to the inputs, or of an update, enforcing a respective data flow at the current point in time. Next, we have the Boolean operations of negation and conjunction, which allow us to express arbitrary Boolean combinations of predicate evaluations and updates. Finally, we have the temporal operator *next*: ◯ψ, to specify the behavior at the next point in time, and the temporal operator *until*: ϑ U ψ, which enforces the property ϑ to hold until the property ψ holds, where ψ must hold at some point in the future.

*Semantics.* Formally, this leads to the following semantics. Let ⟨·⟩: F → F, ι ∈ I<sup>ω</sup>, and ς ∈ C<sup>ω</sup> be given. The validity of a TSL formula ϕ with respect to ς and ι is defined inductively over t ∈ N via:

$$\begin{array}{lll}
\varsigma,\iota,t \models_{\langle\cdot\rangle} \mathtt{p}\ \tau_0 \cdots \tau_{m-1} & :\Leftrightarrow & \eta_{\langle\cdot\rangle}(\varsigma,\iota,t,\mathtt{p}\ \tau_0 \cdots \tau_{m-1}) \\
\varsigma,\iota,t \models_{\langle\cdot\rangle} \llbracket\, \mathtt{s} \leftarrow \tau \,\rrbracket & :\Leftrightarrow & \varsigma(t)(\mathtt{s}) \equiv \tau \\
\varsigma,\iota,t \models_{\langle\cdot\rangle} \neg\psi & :\Leftrightarrow & \varsigma,\iota,t \not\models_{\langle\cdot\rangle} \psi \\
\varsigma,\iota,t \models_{\langle\cdot\rangle} \vartheta \wedge \psi & :\Leftrightarrow & \varsigma,\iota,t \models_{\langle\cdot\rangle} \vartheta \ \text{ and } \ \varsigma,\iota,t \models_{\langle\cdot\rangle} \psi \\
\varsigma,\iota,t \models_{\langle\cdot\rangle} \bigcirc\psi & :\Leftrightarrow & \varsigma,\iota,t+1 \models_{\langle\cdot\rangle} \psi \\
\varsigma,\iota,t \models_{\langle\cdot\rangle} \vartheta\,\mathcal{U}\,\psi & :\Leftrightarrow & \exists t'' \geq t.\ \forall t \leq t' < t''.\ \varsigma,\iota,t' \models_{\langle\cdot\rangle} \vartheta \ \text{ and } \ \varsigma,\iota,t'' \models_{\langle\cdot\rangle} \psi
\end{array}$$

Note that the satisfaction of a predicate depends on the current computation step and the steps of the past, while for updates it only depends on the current computation step. Furthermore, updates are only checked syntactically, while the satisfaction of predicates depends on the given assignment ⟨·⟩ and the input stream ι. We say that ς and ι satisfy ϕ, denoted by ς, ι ⊨<sub>⟨·⟩</sub> ϕ, if ς, ι, 0 ⊨<sub>⟨·⟩</sub> ϕ.
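The evaluation function η above can be transcribed almost directly into code. The following sketch assumes a simple term encoding of our own choosing (strings for inputs and cells, tuples `("f", τ0, ...)` for curried applications); it is an illustration of the definition, not the authors' tool.

```python
# η⟨·⟩(ς, ι, t, τ): evaluate a function term τ at time t, given a function
# assignment (assign), initial cell values (init), a computation (varsigma,
# a list of dicts mapping cells/outputs to terms), and an input stream (iota).

def eta(assign, init, varsigma, iota, t, term, inputs, cells):
    if isinstance(term, str):
        if term in inputs:                       # s_i ∈ I
            return iota[t][term]
        if term in cells:
            if t == 0:                           # s_i ∈ C and t = 0
                return init[term]
            # s_i ∈ C and t > 0: evaluate the term the cell was updated
            # with in the previous step, at the previous time point.
            return eta(assign, init, varsigma, iota, t - 1,
                       varsigma[t - 1][term], inputs, cells)
    f, *args = term                              # application f τ0 ... τ(m-1)
    return assign[f](*(eta(assign, init, varsigma, iota, t, a, inputs, cells)
                       for a in args))
```

For instance, with a cell y initialized to 0 and updated via ⟦y ← inc x⟧ in the first step, evaluating y at time 1 applies `inc` to the input value of x at time 0.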

Besides the basic operators, we have the standard derived Boolean operators, as well as the derived temporal operators: *release* ϕ R ψ ≡ ¬((¬ϕ) U (¬ψ)), *finally* ◇ϕ ≡ *true* U ϕ, *always* □ϕ ≡ *false* R ϕ, the *weak* version of *until* ϕ W ψ ≡ (ϕ U ψ) ∨ (□ϕ), and *as soon as* ϕ A ψ ≡ ¬ψ W (ψ ∧ ϕ).

*Realizability.* We are interested in the following realizability problem: given a TSL formula ϕ, is there a strategy σ ∈ C<sup>[I<sup>+</sup>]</sup> such that for every input stream ι ∈ I<sup>ω</sup> and every function assignment ⟨·⟩: F → F, the branch σ ≀ ι satisfies ϕ, i.e.,

$$\exists \sigma \in \mathcal{C}^{[\mathcal{I}^+]}.\ \forall \iota \in \mathcal{I}^{\omega}.\ \forall \langle\cdot\rangle\colon \mathbb{F} \to \mathcal{F}.\ \ \sigma \wr \iota,\, \iota \;\models_{\langle\cdot\rangle} \varphi$$

If such a strategy σ exists, we say σ realizes ϕ. If we additionally ask for a concrete instantiation of σ, we consider the synthesis problem of TSL.

#### **5 TSL Properties**

In order to synthesize programs from TSL specifications, we give an overview of the first part of our synthesis process, as shown in Fig. 1. First, we show how to approximate the semantics of TSL through a reduction to LTL. Due to the approximation, however, the first attempt to find a realizing strategy may fail. Our solution is a CEGAR loop that iteratively improves the approximation. This CEGAR loop is necessary because the realizability problem of TSL is undecidable in general.

*Approximating TSL with LTL.* We approximate TSL formulas with weaker LTL formulas. The approximation reinterprets the syntactic elements, T<sup>P</sup> and TU, as atomic propositions for LTL. This strips away the semantic meaning of the function application and assignment in TSL, which we reconstruct by later adding assumptions lazily to the LTL formula.

Formally, let T<sub>P</sub> and T<sub>U</sub> be the finite sets of predicate terms and updates that appear in ϕ<sub>TSL</sub>, respectively. We partition T<sub>U</sub> by assigned signal into the sets T<sub>U</sub><sup>s<sub>o</sub></sup> for s<sub>o</sub> ∈ O ∪ C. For every c ∈ C let T<sub>U/id</sub><sup>c</sup> = T<sub>U</sub><sup>c</sup> ∪ {⟦c ← c⟧}, and for every o ∈ O let

$$\text{(a)}\quad \square\, \bigl( \llbracket\, \mathtt{y} \leftarrow \mathtt{y} \,\rrbracket \vee \llbracket\, \mathtt{y} \leftarrow \mathtt{x} \,\rrbracket \bigr) \;\wedge\; \bigl( \Diamond\, \mathtt{p}\ \mathtt{x} \;\rightarrow\; \Diamond\, \mathtt{p}\ \mathtt{y} \bigr)$$

**Fig. 5.** A TSL specification (a) with input x and cell y that is realizable. A winning strategy is to save x to y as soon as p(x) is satisfied. However, the initial approximation (b), which is passed to an LTL synthesis solver, is unrealizable, as proven by the counter-strategy (c) returned by the LTL solver.

T<sub>U/id</sub><sup>o</sup> = T<sub>U</sub><sup>o</sup>, and let T<sub>U/id</sub> = ⋃<sub>s<sub>o</sub>∈O∪C</sub> T<sub>U/id</sub><sup>s<sub>o</sub></sup>. We construct the LTL formula ϕ<sub>LTL</sub> over the input propositions T<sub>P</sub> and output propositions T<sub>U/id</sub> as follows:

$$\varphi_{\mathit{LTL}} \;=\; \square \left( \bigwedge_{\mathtt{s_o} \in \mathbf{O} \cup \mathbf{C}} \ \bigvee_{\tau \in \mathcal{T}_{U\!/\mathit{id}}^{\mathtt{s_o}}} \left( \tau \;\wedge \bigwedge_{\tau' \in \mathcal{T}_{U\!/\mathit{id}}^{\mathtt{s_o}} \setminus \{\tau\}} \neg \tau' \right) \right) \;\wedge\; \mathrm{SyntacticConversion}(\varphi_{\mathit{TSL}})$$

Intuitively, the first part of the equation partially reconstructs the semantic meaning of updates by ensuring that a signal is not updated with multiple values at a time. The second part extracts the reactive constraints of the TSL formula without the semantic meaning of functions and updates.
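The "exactly one update per signal" part of this construction can be sketched as follows. This is a small illustration of our own: propositions are plain strings, and the resulting formula is printed in a generic infix LTL syntax rather than fed to an actual synthesizer.

```python
# Build the mutual-exclusion part of ϕ_LTL: in every step, each assigned
# signal s_o carries exactly one update proposition from T_{U/id}^{s_o}.

def exactly_one(props):
    """Disjunction over props of (τ ∧ ¬τ' for every other τ')."""
    return " | ".join(
        "(" + " & ".join([t] + ["!" + u for u in props if u != t]) + ")"
        for t in props)

def single_update_constraint(updates_by_signal):
    """updates_by_signal: signal -> list of its update propositions."""
    return "G(" + " & ".join(
        "(" + exactly_one(ps) + ")"
        for ps in updates_by_signal.values()) + ")"
```

For the cell y of Fig. 5 with updates ⟦y ← x⟧ and ⟦y ← y⟧, the constraint says: globally, exactly one of the two update propositions holds.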

**Theorem 1 (**[19]**).** *If* ϕ*LTL is realizable, then* ϕ*TSL is realizable.*

Note that unrealizability of ϕ*LTL* does not imply that ϕ*TSL* is unrealizable. It may be that we have not added sufficiently many environment assumptions to the approximation in order for the system to produce a realizing strategy.

*Example.* We present a simple TSL specification in Fig. 5a. The specification asserts that the environment provides an input x for which the predicate p x will be satisfied eventually. The system must guarantee that eventually p y holds. According to the semantics of TSL, the formula is realizable: the system can take the value of x when p x is true and save it to y, thus guaranteeing that p y is satisfied eventually. This is in contrast to LTL, which has no semantics for pure functions and treats the evaluation of p y as an environment-controlled value that need not obey the consistency of a pure function.

*Refining the LTL Approximation.* It is possible that the LTL solver returns a counter-strategy for the environment although the original TSL specification is realizable. We call such a counter-strategy *spurious*, as it exploits the additional freedom of LTL to violate the purity of predicates, as made possible by the underapproximation. Formally, a counter-strategy is an infinite tree π: C<sup>∗</sup> → 2<sup>T<sub>P</sub></sup>, which provides predicate evaluations in response to possible update assignments of function terms τ<sub>F</sub> ∈ T<sub>F</sub> to outputs o ∈ O. W.l.o.g. we can assume that O, T<sub>F</sub>, and T<sub>P</sub> are finite, as they can always be restricted to the outputs and terms that appear in the formula. A counter-strategy is spurious iff there is a branch π ≀ ς for some computation ς ∈ C<sup>ω</sup>, for which the strategy chooses an inconsistent evaluation of two equal predicate terms at different points in time, i.e.,


$$\begin{array}{l}
\exists \varsigma \in \mathcal{C}^{\omega}.\ \exists t, t' \in \mathbb{N}.\ \exists \tau_{P} \in \mathcal{T}_{P}. \\
\quad \tau_{P} \in \pi(\varsigma(0)\varsigma(1)\ldots\varsigma(t-1)) \;\wedge\; \tau_{P} \notin \pi(\varsigma(0)\varsigma(1)\ldots\varsigma(t'-1)) \;\wedge \\
\quad \forall \langle\cdot\rangle\colon \mathbb{F} \to \mathcal{F}.\ \eta_{\langle\cdot\rangle}(\varsigma, \pi \wr \varsigma, t, \tau_{P}) = \eta_{\langle\cdot\rangle}(\varsigma, \pi \wr \varsigma, t', \tau_{P}).
\end{array}$$

Note that a non-spurious counter-strategy may still evaluate predicates inconsistently across different branches: by the definition of realizability, the environment can choose the function and predicate assignments differently against every system strategy.

By the purity of predicates in TSL, the environment is forced to always return the same value for predicate evaluations on equal values. However, this semantic property cannot be enforced implicitly in LTL. To resolve this issue, we use the returned counter-strategy to identify spurious behavior and strengthen the LTL underapproximation with additional environment assumptions. After adding the derived assumptions, we re-execute the LTL synthesizer to check whether the added assumptions are sufficient to obtain a winning strategy for the system. If the solver still returns a spurious strategy, we continue the loop in a CEGAR fashion until the set of added assumptions is sufficiently complete. However, if a non-spurious counter-strategy is returned, we have a proof that no implementation exists for the considered bound.

Algorithm 1 shows how a returned counter-strategy π is checked for spuriousness. To this end, it suffices to check π against system strategies bounded by the given bound b, as we use bounded synthesis [20]. Furthermore, we can assume w.l.o.g. that π is given by a finite-state representation, which is always possible due to the finite-model guarantees of LTL. Also note that π, as returned by the LTL synthesizer, responds to sequences of sets of updates in (2<sup>T<sub>U/id</sub></sup>)<sup>∗</sup>. However, in our case (2<sup>T<sub>U/id</sub></sup>)<sup>∗</sup> is an alternative representation of C<sup>∗</sup>, due to the additional "single update" constraints added during the construction of ϕ<sub>LTL</sub>.

The algorithm iterates over all possible responses v ∈ C<sup>m·b</sup> of the system up to depth m · b. This is sufficient, since any deeper exploration would result in a state repetition in the cross-product of the finite-state representation of π and any system strategy bounded by b; hence, the same behaviour could also be generated by a sequence shorter than m · b. At the same time, the algorithm iterates over the predicates τ<sub>P</sub> ∈ T<sub>P</sub> appearing in ϕ<sub>TSL</sub> and over times t and t′ smaller than m · b. For each of these elements, spuriousness is checked by comparing the output of π for the evaluation of τ<sub>P</sub> at times t and t′, which should only differ if the inputs to the predicate differ as well. This can only happen if the passed input terms have been constructed differently over the past. We check this using the evaluation function η equipped with the identity assignment ⟨·⟩<sub>id</sub> : F → F, with ⟨f⟩<sub>id</sub> = f for all f ∈ F, and the input sequence ι<sub>id</sub>, with ι<sub>id</sub>(t)(i) = (t, i) for all t ∈ N and i ∈ I, which always generates a fresh input. Syntactic inequality of η<sub>⟨·⟩id</sub>(v, ι<sub>id</sub>, t, τ<sub>P</sub>) and η<sub>⟨·⟩id</sub>(v, ι<sub>id</sub>, t′, τ<sub>P</sub>) is then a sufficient condition for the existence of an assignment ⟨·⟩ : F → F for which τ<sub>P</sub> evaluates differently at times t and t′.
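In the same spirit, the core check of Algorithm 1 can be sketched as a bounded brute-force search. The interfaces here are simplified assumptions, not the paper's exact ones: `pi` maps a finite response prefix to the set of predicates the environment evaluates to true, and `eval_term` stands in for the evaluation function η under the identity assignment with fresh inputs.

```python
from itertools import product

def find_spurious_behavior(pi, responses, predicates, eval_term, depth):
    """Search for a response on which `pi` evaluates the same predicate
    differently at two time points although its syntactically evaluated
    arguments (under fresh inputs) agree; returns a witness or None."""
    for v in product(responses, repeat=depth):
        for p in predicates:
            for t in range(1, depth + 1):
                for t2 in range(t + 1, depth + 1):
                    same_args = eval_term(v, t, p) == eval_term(v, t2, p)
                    differs = (p in pi(v[:t])) != (p in pi(v[:t2]))
                    if same_args and differs:
                        return v, p, t, t2  # witness of spuriousness
    return None
```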

If spurious behaviour of π is found, then the revealing response v ∈ C<sup>∗</sup> is first simplified using reduce, which turns v back into a sequence of sets of updates w ∈ (2<sup>T<sub>U/id</sub></sup>)<sup>∗</sup> and removes updates that do not affect the behavior of τ<sub>P</sub> at the times t and t′, in order to accelerate the termination of the CEGAR loop. Afterwards, the sequence w is turned into a new assumption that prohibits the spurious behavior, generalized to prevent it even at arbitrary points in time.

As an example of this process, reconsider the spurious counter-strategy of Fig. 5c. Already after the first system response ⟦y ↜ x⟧, the environment produces an inconsistency by evaluating p x and p y differently. This is inconsistent, as the cell y holds the same value at time t = 1 as the input x at time t = 0. Using Algorithm 1 we generate the new assumption (⟦y ↜ x⟧ → (p x ↔ p y)). After adding this strengthening, the LTL synthesizer returns a realizability result.

*Undecidability.* Although we can approximate the semantics of TSL with LTL, there are TSL formulas that cannot be expressed as LTL formulas of finite size.

**Theorem 2 (**[19]**).** *The realizability problem of TSL is undecidable.*

#### **6 TSL Synthesis**

Our synthesis framework provides a modular refinement process to synthesize executables from TSL specifications, as depicted in Fig. 1. The user initially provides a TSL specification over predicate and function terms. At the end of the procedure, the user receives an executable to control a reactive system.

The first step of our method answers the synthesis question of TSL: if the specification is realizable, a control flow model is returned. To this end, an intermediate translation to LTL is used, utilizing an LTL synthesis solver that produces circuits in the AIGER format. If the specification is realizable, the resulting control flow model is turned into Haskell code, implemented as an independent Haskell module. The user has the choice between two different targets: a module built on Arrows, which is compatible with any Arrowized FRP library, or a module built on Applicative, which supports Applicative FRP libraries. Our procedure generates a single Haskell module per TSL specification, which makes it natural to decompose a project according to individual tasks. Each module provides a single component, parameterized by its initial state and the pure function and predicate transformations. As soon as these are provided as part of the surrounding project context, a final executable can be generated by compiling the Haskell code.

An important feature of our synthesis approach is that implementations for the terms used in the specification are only required after synthesis. This allows

**Fig. 6.** Example CFM of the music player generated from a TSL specification.

the user to explore several possible specifications before deciding on any term implementations.

*Control Flow Model.* The first step of our approach is the synthesis of a *Control Flow Model* M (CFM) from the given TSL specification ϕ, which provides us with a uniform representation of the control flow structure of our final program.

Formally, a CFM M is a tuple M = (I, O, C, V, ℓ, δ), where I is a finite set of inputs, O is a finite set of outputs, C is a finite set of cells, V is a finite set of vertices, ℓ : V → F ∪ P assigns to each vertex a function f ∈ F or a predicate p ∈ P, and

$$\delta \colon (\mathbb{O} \cup \mathbb{C} \cup V) \times \mathbb{N} \to (\mathbb{I} \cup \mathbb{C} \cup V \cup \{\bot\})$$

is a dependency relation that relates every output, cell, and vertex of the CFM with n ∈ N arguments, which are either inputs, cells, or vertices. Outputs and cells s ∈ O ∪ C always have exactly one argument, i.e., δ(s, 0) ≢ ⊥ and ∀m > 0. δ(s, m) ≡ ⊥, while for vertices x ∈ V the number of arguments n ∈ N aligns with the arity of the assigned function or predicate ℓ(x), i.e., ∀m ∈ N. δ(x, m) ≡ ⊥ ↔ m > n. A CFM is valid if it does not contain circular dependencies, i.e., on every cycle induced by δ there must lie at least one cell. We only consider valid CFMs.
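Validity, i.e., that every cycle induced by δ passes through at least one cell, amounts to acyclicity of the dependency graph restricted to non-cell nodes. A minimal sketch, where the dict-based encoding of δ is our own assumption:

```python
def is_valid_cfm(delta, cells):
    """delta: dict mapping each output/cell/vertex to its argument list
    (inputs, cells, or vertices); cells: set of cell names. Valid iff the
    graph restricted to non-cell nodes is acyclic (DFS cycle detection)."""
    graph = {node: [a for a in args if a in delta and a not in cells]
             for node, args in delta.items() if node not in cells}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def dfs(n):
        color[n] = GRAY
        for m in graph[n]:
            if color[m] == GRAY:                  # back edge: cell-free cycle
                return False
            if color[m] == WHITE and not dfs(m):
                return False
        color[n] = BLACK
        return True

    return all(dfs(n) for n in graph if color[n] == WHITE)
```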

An example CFM for our music player of Sect. 2 is depicted in Fig. 6. Inputs I come in from the left and outputs O leave on the right. The example contains a single cell c ∈ C, which holds the stateful memory Cell, introduced during synthesis for the module. The green, arrow-shaped boxes depict vertices V, which are labeled with function and predicate names according to ℓ. For the Boolean decisions that define δ, we use circuit symbols for conjunction, disjunction, and negation. Boolean decisions are piped to a multiplexer gate that selects the respective update streams. This allows each update stream to be passed to an output stream if and only if the respective Boolean trigger evaluates positively, while our construction ensures mutual exclusion of the Boolean triggers. For code generation, the logic gates are implemented using the corresponding dedicated Boolean functions. After building a control structure, we assign semantics to functions and predicates by providing implementations. To this end, we use Functional Reactive Programming (FRP). Prior work has established Causal Commutative Arrows (CCA) as an FRP language pattern equivalent to a CFM [33,34,53]. CCAs are an abstraction subsumed by other functional reactive programming abstractions, such as Monads, Applicative, and Arrows [32,33]. There are many FRP libraries using Monads [11,14,42], Applicative [2,3,23,48], or Arrows [10,39,41,51], and since every Monad is also an Applicative, and Applicative and Arrows are both universal design patterns, we can give uniform translations to all of these libraries using translations to just Applicative and Arrows. Both translations are possible due to the flexible notion of a CFM.
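The multiplexer selection described above can be illustrated by a small sketch (in Python rather than the generated Haskell, purely for illustration); the list-based encoding of triggers and update streams is an assumption, and the sketch relies on the mutual exclusion the construction guarantees.

```python
def mux(triggers, values):
    """Select the update-stream value whose Boolean trigger holds.
    The CFM construction guarantees the triggers are mutually exclusive."""
    assert sum(triggers) == 1, "triggers must be mutually exclusive"
    return next(v for t, v in zip(triggers, values) if t)
```

For example, `mux([False, True, False], ['play', 'pause', 'stop'])` selects `'pause'`.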

In the last step, the synthesized FRP program is compiled into an executable, using the provided function and predicate implementations. This step is not fixed to a single compiler implementation, but can in fact use any FRP compiler (or library) that supports a language abstraction at least as expressive as CCA. For example, instead of creating an Android music player app, we could target an FRP web interface [48] to create an online music player, or an embedded FRP library [23] to instantiate the player on a computationally more restricted device. By using the strong core of CCA, we can even implement the player directly in hardware, which is possible, for example, with the CλaSH compiler [3]. Note that we still need separate implementations of the functions and predicates for each target. However, the specification and synthesized CFM always stay the same.

#### **7 Experimental Results**

To evaluate our synthesis procedure we implemented a tool that follows the structure of Fig. 1. It first encodes the given TSL specification in LTL and then refines it until an LTL solver either produces a realizability result or returns a non-spurious counter-strategy. For LTL synthesis we use the bounded synthesis tool BoSy [15]. As soon as we obtain a realizing strategy, it is translated into a corresponding CFM. Then, we generate the FRP program structure. Finally, after providing function implementations, the result is compiled into an executable.

To demonstrate the effectiveness of synthesizing TSL, we applied our tool to a collection of benchmarks from different application domains, listed in Table 1. Every benchmark class consists of multiple specifications, addressing different features of TSL. We created all specifications from scratch, taking care that they relate either to existing textual specifications or to real-world scenarios. A short description of each benchmark class is given in [19].

For every benchmark, we report the synthesis time and the size of the synthesized CFM, split into the number of cells (|CM|) and vertices (|VM|) used. The synthesized CFM may use more cells than the original TSL specification if synthesis requires more memory in order to realize a correct control flow.


**Table 1.** Number of cells <sup>|</sup>CM<sup>|</sup> and vertices <sup>|</sup>VM<sup>|</sup> of the resulting CFM <sup>M</sup> and synthesis times for a collection of TSL specifications ϕ. A \* indicates that the benchmark additionally has an initial condition as part of the specification.

**Table 2.** Set of programs that use purity to keep one or two counters in range. Synthesis needs multiple refinements of the specification to prove realizability.


The synthesis was executed on a quad-core Intel Xeon processor (E3-1271 v3, 3.6 GHz, 32 GB RAM, PC1600, ECC), running 64-bit Ubuntu 16.04 LTS.

The experiments of Table 1 show that TSL successfully lifts the applicability of synthesis from the Boolean domain to arbitrary data domains, allowing for new applications that utilize every level of required abstraction. For all benchmarks we always found a realizable system within a reasonable amount of time, where the results often required synthesized cells to realize the control flow behavior.

We also considered a preliminary set of benchmarks that require multiple refinement steps to be synthesizable. An overview of the results is given in Table 2. The benchmarks are inspired by examples of the Reactive Banana FRP library [2]. Here, purity of function and predicate applications must be utilized by the system to ensure that the value of one or two counters never goes out of range. Thereby, the system not only needs purity to verify this condition, but also to make the correct decisions in the resulting implementation to be synthesized.

#### **8 Related Work**

Our approach builds on the rich body of work on reactive synthesis, see [17] for a survey. The classic reactive synthesis problem is the construction of a finite-state machine that satisfies a specification in a temporal logic like LTL. Our approach differs from the classic problem in its connection to an actual programming paradigm, namely FRP, and its separation of control and data.

The synthesis of *reactive programs*, rather than finite-state machines, has previously been studied for standard temporal logic [21,35]. Because there is no separation of control and data, these approaches do not directly scale to realistic applications. With regard to FRP, a *Curry-Howard correspondence* between LTL and FRP in a dependently typed language was discovered [28,29] and used to prove properties of FRP programs [8,30]. However, our paper is the first, to the best of our knowledge, to study the synthesis of FRP programs from temporal specifications.

The idea to separate control and data has appeared, on a smaller scale, in the synthesis with *identifiers*, where identifiers, such as the number of a client in a mutual exclusion protocol, are treated symbolically [13]. *Uninterpreted functions* have been used to abstract data-related computational details in the synthesis of synchronization primitives for complex programs [5]. Another connection to other synthesis approaches is our CEGAR loop. Similar *refinement loops* also appear in other synthesis approaches, however with a different purpose, such as the refinement of environment assumptions [1].

So far, there is no immediate connection between our approach and the substantial work on *deductive* and *inductive synthesis*, which is specifically concerned with the data-transformation aspects of programs [16,31,40,47,49,50]. Typically, these approaches are focused on non-reactive, sequential programs. An integration of deductive and inductive techniques into our approach for reactive systems is a very promising direction for future work. Abstraction-based synthesis [4,12,24,37] may potentially provide a link between the approaches.

#### **9 Conclusions**

We have introduced Temporal Stream Logic, which allows the user to specify the control flow of a reactive program. The logic cleanly separates control from complex data, forming the foundation for our procedure to synthesize FRP programs. By utilizing the purity of function transformations our logic scales independently of the complexity of the data to be handled. While we have shown that scalability comes at the cost of undecidability, we addressed this issue by using a CEGAR loop, which lazily refines the underapproximation until either a realizing system implementation or an unrealizability proof is found.

Our experiments indicate that TSL synthesis works well in practice and on a wide range of programming applications. TSL also provides the foundations for further extensions. For example, a user may want to fix the semantics for a subset of the functions and predicates. Such refinements can be implemented as part of a much richer *TSL Modulo Theory* framework.

#### **References**


53. Yallop, J., Liu, H.: Causal commutative arrows revisited. In: Mainland [36], pp. 21–32. https://doi.org/10.1145/2976002.2976019

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Run-Time Optimization for Learned Controllers Through Quantitative Games**

Guy Avni<sup>1(B)</sup>, Roderick Bloem<sup>2</sup>, Krishnendu Chatterjee<sup>1</sup>, Thomas A. Henzinger<sup>1</sup>, Bettina Könighofer<sup>2</sup>, and Stefan Pranger<sup>2</sup>

> <sup>1</sup> IST Austria, Klosterneuburg, Austria (guy.avni@ist.ac.at)
> <sup>2</sup> TU Graz, Graz, Austria

**Abstract.** A controller is a device that interacts with a plant. At each time point, it reads the plant's state and issues commands with the goal that the plant operates optimally. Constructing optimal controllers is a fundamental and challenging problem. Machine learning techniques have recently been successfully applied to train controllers, yet they have limitations. Learned controllers are monolithic and hard to reason about. In particular, it is difficult to add features without retraining, to guarantee any level of performance, and to achieve acceptable performance when encountering untrained scenarios. These limitations can be addressed by deploying quantitative run-time *shields* that serve as a proxy for the controller. At each time point, the shield reads the command issued by the controller and may choose to alter it before passing it on to the plant. We show how optimal shields, which interfere as little as possible while guaranteeing a desired level of controller performance, can be generated systematically and automatically using reactive synthesis. First, we abstract the plant by building a stochastic model. Second, we consider the learned controller to be a black box. Third, we measure *controller performance* and *shield interference* by two quantitative run-time measures that are formally defined using weighted automata. Then, the problem of constructing a shield that guarantees maximal performance with minimal interference is the problem of finding an optimal strategy in a stochastic 2-player game "controller versus shield" played on the abstract state space of the plant with a quantitative objective obtained from combining the performance and interference measures. We illustrate the effectiveness of our approach by automatically constructing lightweight shields for learned traffic-light controllers in various road networks. The shields we generate avoid liveness bugs, improve controller performance in untrained and changing traffic situations, and add features to learned controllers, such as giving priority to emergency vehicles.

#### **1 Introduction**

The *controller synthesis* problem is a fundamental problem that is widely studied by different communities [42,44]. A controller is a device that interacts with a *plant*. At each point in time, it reads the plant's state, e.g., given by sensor readings, and issues

This research was supported in part by the Austrian Science Fund (FWF) under grants S114 (RiSE/SHiNE), Z211-N23 (Wittgenstein Award), and M 2369-N33 (Meitner fellowship).

a command based on the state. The controller should guarantee that the plant operates correctly or optimally with respect to some given specification. As a running example, we consider a traffic light controller for a road intersection (see Fig. 1). The state of the plant refers to the state of the roads leading to the junction; namely, the positions of the cars, their speeds, their sizes, etc. A controller command consists of a light configuration for the junction in the next time frame. Specifications can either be qualitative, e.g., "it should never be the case that a road with an empty queue gets a green light", or quantitative, e.g., "the cost of a controller is the average waiting time of the cars in the junction".

**Fig. 1.** On the left, a concrete state depicted in the traffic simulator SUMO. On the right, we depict the corresponding abstract state with queues cut off at k = 5, and some outgoing transitions. Upon issuing action North-South, a car is evicted from each of the North-South queues. Then, we choose uniformly at random, out of the 16 possible options, the incoming cars to the queues, update the state, and cut off the queues at k (e.g., when a car enters from East, the queue stays at 5).

A challenge in controller synthesis is that, since the number of possible plant readings is huge, it is computationally demanding to find an optimal command given a plant state. Machine learning is a prominent approach to make decisions based on large amounts of collected data [28,37]. It is widely successful in practice and plays an integral part in the design process of various systems. Machine learning has been successfully applied to train controllers [15,33,34] and specifically controllers for traffic control [20,35,39].

A shortcoming of machine-learning techniques is that the controllers they produce are black-box devices that are hard to reason about and modify without a complete re-training. It is thus challenging, for example, to obtain worst-case guarantees about the controller, which is particularly important in safety-critical settings. Attempts to address this problem come from both the formal methods community [46], where verification of learned systems is extensively studied [24,29], and the machine-learning community, where guarantees are added during the training process using reward engineering [13,18] or by modifying the exploration process [11,19,38]. Both approaches require expertise in the respective field and suffer from limitations: scalability for the former, and intricacy and robustness issues for the latter. Moreover, both techniques have mostly been studied for safety properties.

Another shortcoming of machine-learning techniques is that they require expertise and a fine-tuning of parameters. It is difficult, for example, to train controllers that are robust to varying plant behaviors: a controller that has been trained on uniform traffic congestion may meet rush-hour traffic, which can be significantly different and can cause poor performance. Also, it is challenging to add features to a controller without retraining, which is both costly and time consuming. These can include permanent features, e.g., priority to public transport, or temporary changes, e.g., changes due to an accident or construction. Again, since the training process is intricate, adding features during training can have unexpected effects.

In this work, we use quantitative *shields* to deal with the limitations of learned, or any other black-box, controllers. A shield serves as a proxy between the controller and the plant. At each point in time, as before, the controller reads the state of the plant and issues a command. Rather than directly feeding the command to the plant, the shield first reads it along with an abstract plant state. The shield can then choose to keep the controller's command or alter it, before issuing the command to the plant. The concept of shields was first introduced in [30], where shields for safety properties were considered with a qualitative notion of interference: a shield is only allowed to interfere when a controller error occurs, which is only well-defined when considering safety properties. We elaborate on other shield-like approaches in Sect. 1.1.

Our goal is to automatically synthesize shields that optimize quantitative measures for black-box controllers. We are interested in synthesizing lightweight shields. We assume that the controller performs well on average, but has no worst-case guarantees. When combining the shield and the controller, intuitively, the controller should be active for the majority of the time and the shield intervenes only when it is required. We formalize the plant behavior as well as the interference cost using quantitative measures. Unlike safety objectives, where it is clear when a shield must interfere, with quantitative objectives, a non-interference typically does not have a devastating effect. It is thus challenging to decide, at each time point, whether the shield should interfere or not; the shield needs to balance the cost of interfering with the decrease in performance of not interfering. Automatic synthesis of shields is thus natural in this setting.

We elaborate on the two quantitative measures we define. The interaction between the plant, controller, and shield gives rise to an infinite sequence over C × Γ × Γ, where C is a set of plant states and Γ is a set of allowed actions. A triple ⟨c, γ<sub>1</sub>, γ<sub>2</sub>⟩ means that the plant is in state c, the controller issues command γ<sub>1</sub>, and the shield (possibly) alters it to γ<sub>2</sub>. We use *weighted automata* to assign costs to infinite traces, which have proven to be a convenient, flexible, and robust quantitative specification language [14]. Our *behavioral score* measures the performance of the plant and it is formally given by a weighted automaton that assigns scores to traces over C × Γ. Boolean properties are a special case, which include *safety* properties, e.g., "an emergency vehicle should always get a green light", and *liveness*, e.g., "a car waiting in a queue eventually gets the green light". An example of a quantitative score is the long-run average of the waiting times of the vehicles in the city. A second score measures the *interference* of a shield with a controller. It is given by a weighted automaton over the alphabet Γ × Γ. A simple example of an interference score charges the shield 1 for every change of action and charges 0 when no change is made. Then, the score of an infinite trace can be phrased as the ratio of the time that the shield interferes. Using weighted automata we can specify more involved scores such as different charges for different types of alterations or even charges that depend on the past, e.g., altering the controller's command twice in a row is not allowed.
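For instance, the simple interference score mentioned above (charge 1 per alteration, score as the ratio of interfering steps) can be sketched on finite trace prefixes; the list-of-pairs encoding is an assumption for illustration.

```python
def interference_score(trace):
    """trace: finite list of pairs (controller action, shield action).
    Returns the fraction of steps in which the shield altered the command,
    i.e., the finite-prefix analogue of the long-run interference ratio."""
    if not trace:
        return 0.0
    charges = sum(1 for g1, g2 in trace if g1 != g2)
    return charges / len(trace)
```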

Given a probabilistic plant model and a formal specification of behavioral and interference scores, the problem of synthesizing an optimal shield is well-defined and can be solved by game theory. While the game-based techniques we use are those of discrete-event controller synthesis [3] in a stochastic setting with quantitative objectives, our set-up is quite different. In traditional controller synthesis, there are two entities: the controller and the adversarial plant. The goal is to synthesize a controller offline. In our setting, there are three entities: the plant, whose behavior we model probabilistically, the controller, which we treat as a black box and model as an adversary, and the shield, which we synthesize. Note that the shield's synthesis procedure is done offline, but it makes online decisions when it operates together with the controller and plant. Our plant model is formally given by a *Markov decision process* (MDP), a standard model in which lack of knowledge about the plant is modeled using probabilities (see Fig. 1 and details in Example 1). The game is played on the MDP by two players, a shield and a controller, where the quantitative objective is given by the two scores. An optimal shield is then extracted from an optimal strategy for the shield player. The game we construct admits memoryless optimal strategies, thus the size of the shield is proportional to the size of the abstraction of the plant. In addition, it is implemented as a look-up table for actions in every state. Thus, the runtime overhead is a table look-up and hence negligible.

We experiment with our framework by constructing shields for traffic lights in a network of roads. Our experimental results illustrate the usefulness of the framework. We construct shields that consistently improve the performance of controllers, especially when exhibiting behavior that they are not trained on, but, more surprisingly, also while exhibiting trained behavior. We show that the use of a shield reduces variability in performance among various controllers; thus, when using a shield, the choice of the parameters used in the training phase becomes less acute. We show how a shield can be used to add the functionality of prioritizing public transport as well as local fairness to a controller, both without re-training the controller. In addition, we illustrate how shields can add worst-case guarantees on liveness without a costly verification of the controller.

#### **1.1 Related Work**

A shield-like approach to adding safety to systems is called *runtime assurance* [47], and has applications, for example, in control of robotics [41] and drones [12]. In this framework, a switching mechanism alternates between running a high-performance system and a provably safe one. These works differ from ours since they consider safety specifications. As mentioned earlier, a challenge with quantitative specifications is that, unlike safety specifications, a non-interference typically does not have a devastating effect, thus it is not trivial to decide when and to what extent to interfere.

Another line of work is *runtime enforcement*, where an enforcer monitors a program that outputs events and can either terminate the program once it detects an error [45], or alter the event in order to guarantee, for example, safety [21], richer qualitative objectives [16], or privacy [26,49]. The similarity between an enforcer and a shield lies in their ability to alter events. The settings are quite different, however, since the enforced program is not reactive, whereas we consider a plant that receives commands.

Recently, formal approaches were proposed to restrict the exploration of the learning agent such that a set of logical constraints is always satisfied. This method can support properties beyond safety, e.g., probabilistic computation tree logic (PCTL) [25,36], linear temporal logic (LTL) [1], or differential dynamic logic [17]. To the best of our knowledge, quantitative specifications have not yet been considered. Unlike these approaches, we consider the learned controller as a black box, thus our approach is particularly suitable for machine-learning non-experts.

While MDPs and partially observable MDPs have been widely studied in the literature w.r.t. quantitative objectives [27,43], our framework requires the interaction of two players (the shield and the black-box controller), and we use a game-theoretic framework with quantitative objectives for our solution.

#### **2 Definitions and Problem Statement**

#### **2.1 Plants, Controllers, and Shields**

The interaction with a *plant* over a concrete set of states C is carried out using two functionalities: PLANT.GETSTATE returns the plant's current state, and PLANT.ISSUECOMMAND issues an action from a set Γ. Once an action is issued, the plant updates its state according to some unknown transition function. At each point in time, the *controller* reads the state of the plant and issues a command. Thus, it is a function from a history in (C × Γ)<sup>∗</sup> · C to Γ.

Informally, a *shield* serves as a proxy between the controller and the plant. At each time point, it reads the controller's issued action and can choose an alternative action to issue to the plant. We are interested in lightweight shields that add little or no overhead to the controller, thus the shield must be defined w.r.t. an abstraction of the plant, which we define formally below.

**Abstraction.** An abstraction is a *Markov decision process* (MDP, for short) A = ⟨Γ, A, a<sub>0</sub>, δ⟩, where Γ is a set of actions, A is a set of abstract plant states, a<sub>0</sub> ∈ A is an initial state, and δ : A × Γ → [0, 1]<sup>A</sup> is a probabilistic transition function, i.e., for every a ∈ A and γ ∈ Γ, we have Σ<sub>a′∈A</sub> δ(a, γ)(a′) = 1. The probabilities in the abstraction model our lack of knowledge of the plant, and we assume that they reflect the behavior exhibited by the plant. A *policy* f is a function from a finite history of states in A<sup>∗</sup> to the next action in Γ; thus it gives rise to a probability distribution D(f) over infinite sequences over A.

*Example 1.* Consider a plant that represents a junction with four incoming directions (see Fig. 1). We describe an abstraction A for the junction that specifies how many cars are waiting in each queue, where we cut off the count at a parameter k ∈ ℕ. Formally, an abstract state is a vector in {0,...,k}⁴, where the indices respectively represent the North, East, South, and West queues. The larger k is, the closer the abstraction is to the concrete plant. The set of possible actions represents the possible light directions in the junction, {NS, EW}. The abstract transitions estimate the plant behavior, and we describe them in two steps. Consider an abstract state a = (a1, a2, a3, a4) and suppose the issued action is NS (the case of EW is similar). We allow a car to cross the junction from each of the North and South queues and decrease the two queues: let a′ = (max{0, a1 − 1}, a2, max{0, a3 − 1}, a4). Next, we probabilistically model incoming cars to the queues as follows. Consider a vector ⟨i1, i2, i3, i4⟩ ∈ {0, 1}⁴ that represents incoming cars to the queues. Let a″ be such that, for 1 ≤ j ≤ 4, we add i_j to the j-th queue and trim at k, thus a″_j = min{a′_j + i_j, k}. Then, in A, when performing action NS in a, we move to a″ with the uniform probability 1/16.
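The two-step transition construction of Example 1 can be sketched as follows. This is an illustrative sketch, not the authors' implementation; the function and variable names are our own.

```python
import itertools

def junction_transitions(a, action, k):
    """Transition distribution of the junction abstraction (Example 1).

    `a` is an abstract state (n, e, s, w) of queue lengths trimmed at k,
    `action` is "NS" or "EW".  Returns a dict mapping successor states to
    probabilities; the 16 incoming-car vectors are uniform (prob. 1/16 each).
    """
    n, e, s, w = a
    # Step 1: one car crosses from each queue of the green direction.
    if action == "NS":
        a1 = (max(0, n - 1), e, max(0, s - 1), w)
    else:  # "EW"
        a1 = (n, max(0, e - 1), s, max(0, w - 1))
    # Step 2: add incoming cars, trimming each queue at k.
    dist = {}
    for inc in itertools.product((0, 1), repeat=4):
        succ = tuple(min(q + i, k) for q, i in zip(a1, inc))
        dist[succ] = dist.get(succ, 0.0) + 1.0 / 16
    return dist
```

Note that successors whose trimmed components coincide accumulate probability, so the distribution has at most 16 entries and sums to 1.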

We define shields formally. Let Γ be a set of commands, M a set of memory states, C and A sets of concrete and abstract states, respectively, and let α : C → A be a mapping between the two. A shield is a function SHIELD : A × M × Γ → Γ × M together with an initial memory state m0 ∈ M. We use PLANT to refer to the plant, which, recall, has two functionalities: reading the current state and issuing a command from Γ. Let CONT be a controller, which has a single functionality: given a history of plant states, it returns the command to issue to the plant. The interaction of the components is captured in the following pseudo code:

```
m ← m0 ∈ M and π ← empty sequence
while true do
   c ← PLANT.GETSTATE() ∈ C
   γ ← CONT.GETCOMMAND(π · c)
   a ← α(c) ∈ A                 // generate abstract state for shield
   ⟨γ′, m′⟩ ← SHIELD(a, γ, m)
   PLANT.ISSUECOMMAND(γ′)
   m ← m′                       // update shield memory state
   π ← π · ⟨c, γ′⟩              // update plant history
end while
```
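The interaction loop above can also be rendered in Python. The `plant`, `controller`, and `shield` objects below are hypothetical stubs supplied by the caller; the method names simply mirror the pseudo code and are not part of any real API.

```python
def run_with_shield(plant, controller, shield, alpha, m0, steps):
    """Run the shielded control loop for `steps` iterations.

    `shield(a, gamma, m) -> (gamma2, m2)` follows SHIELD : A x M x Gamma
    -> Gamma x M, and `alpha` maps concrete to abstract states.
    Returns the plant history pi as a list of (state, issued command).
    """
    m, history = m0, []
    for _ in range(steps):
        c = plant.get_state()
        gamma = controller.get_command(history + [c])
        a = alpha(c)                     # abstract state for the shield
        gamma2, m = shield(a, gamma, m)  # possibly altered command + memory
        plant.issue_command(gamma2)
        history.append((c, gamma2))
    return history
```

A toy plant and a constant controller suffice to exercise the loop; the shield can override the command based only on the abstract state, as in the paper's model.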
#### **2.2 Quantitative Objectives for Shields**

We are interested in two types of performance measures for shields. The *behavioral measure* quantifies the quality of the plant's behavior when operated with a controller and shield. The *interference measure* quantifies the degree to which a shield interferes with the controller. Formally, we need to specify values for infinite sequences, and we use *weighted automata*, which are a convenient model to express such values.

**Weighted Automata.** A weighted automaton is a function from infinite strings to values. Technically, a weighted automaton is similar to a standard automaton, only that the transitions are labeled, in addition to letters, with numbers (weights). Unlike standard automata, in which a run is either accepting or rejecting, a run in a weighted automaton has a value. We focus on limit-average automata, in which the value is the limit average of the running sum of weights that the run traverses. Formally, a weighted automaton is W = ⟨Σ, Q, q0, Δ, cost⟩, where Σ is a finite alphabet, Q is a finite set of states, Δ ⊆ (Q × Σ × Q) is a deterministic transition relation, i.e., for every q ∈ Q and σ ∈ Σ, there is at most one q′ ∈ Q with Δ(q, σ, q′), and cost : Δ → ℚ specifies costs for transitions. A *run* of W on an infinite word σ = σ1, σ2, ... is r = r0, r1, ... ∈ Q^ω such that r0 = q0 and, for i ≥ 1, we have Δ(r_{i−1}, σ_i, r_i). Note that W is deterministic, so there is at most one run on every word. The value that W assigns to σ is lim inf_{n→∞} (1/n) Σ_{i=1}^{n} cost(r_{i−1}, σ_i, r_i).
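For intuition, the running average that the limit-average value is defined over can be computed on any finite prefix; the value of an infinite word is the lim inf of these averages as the prefix grows. The dictionary encoding of Δ and cost below is our own illustrative choice.

```python
def prefix_average(delta, cost, q0, word):
    """Average cost of the unique run of a deterministic weighted automaton
    on a finite prefix.

    `delta[(q, sigma)] -> q_next` encodes the deterministic transition
    relation; `cost[(q, sigma, q_next)]` the transition weights.
    """
    q, total = q0, 0.0
    for sigma in word:
        q_next = delta[(q, sigma)]       # unique successor (determinism)
        total += cost[(q, sigma, q_next)]
        q = q_next
    return total / len(word)
```

For example, a one-state automaton over {'i', 'n'} charging 1 for each interference letter 'i' computes exactly the long-run interference ratio.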

**Behavioral Score.** A *behavioral* score measures the quality of the behavior that the plant exhibits. It is given by a weighted automaton over the alphabet A × Γ; thus it assigns real values to infinite sequences over A × Γ. In our experiments, we use a *concrete* behavioral score, which assigns values to infinite sequences over C × Γ. We compare the performance of the plant with various controllers and shields w.r.t. the concrete score rather than the abstract score. With a weighted automaton we can express costs that change over time: for example, we can penalize traffic lights that change frequently.

**Interference Score.** The second score we consider measures the interference of the shield with the controller. An *interference* score is given by a weighted automaton over the alphabet Γ × Γ. With a weighted automaton we can express costs that change over time: for example, interfering once costs 1 and any successive interference costs 2, thus we reward the shield for short interferences.

**From Shields and Controllers to Policies.** Consider an abstraction MDP A. To ensure worst-case guarantees, we treat the controller as an adversary for the shield. Let SHIELD be a shield with memory set M and initial memory state m0. Intuitively, we find a policy in A that represents the interaction of SHIELD with a controller that maximizes the cost incurred. Formally, an *abstract controller* is a function χ : A∗ → Γ. The interaction between SHIELD and χ gives rise to a policy pol(SHIELD, χ) in A, which, recall, is a function from A∗ to Γ. We define pol(SHIELD, χ) inductively as follows. Consider a history π ∈ A∗ that ends in a ∈ A, and suppose the current memory state of SHIELD is m ∈ M. Let γ = χ(π) and let ⟨γ′, m′⟩ = SHIELD(a, γ, m). Then, the action that the policy pol(SHIELD, χ) assigns is γ′, and we update the memory state to be m′.

**Problem Definition; Quantitative Shield Synthesis.** Consider an abstraction MDP A, a behavioral score BEH and an interference score INT, both given as weighted automata, and a factor λ ∈ [0, 1] with which we weigh the two scores. Our goal is to find an *optimal shield* w.r.t. these inputs, as we define below. Consider a shield SHIELD with memory set M. Let X be the set of abstract controllers. For SHIELD and χ ∈ X, let D(SHIELD, χ) be the probability distribution over A × Γ × Γ that the policy pol(SHIELD, χ) gives rise to. The *value* of SHIELD, denoted val(SHIELD), is sup_{χ∈X} E_{r∼D(SHIELD,χ)}[λ · INT(r) + (1 − λ) · BEH(r)]. An *optimal shield* is a shield whose value is inf_{SHIELD} val(SHIELD).

*Remark 1* **(Robustness and flexibility).** The problem definition we consider allows quantitative optimization of shields w.r.t. two dimensions of quantitative measures. Earlier works have considered shields, but mainly with respect to Boolean measures in both dimensions. For example, in [30], shields for safety behavioral measures were constructed with a Boolean notion of interference, as well as a Boolean notion of shield correctness. In contrast, we allow quantitative objectives in both dimensions, which yields a much more general and robust framework. For example, the first measure of correctness can be quantitative and minimize the error rate, and the second measure can allow shields to correct but minimize the long-run average interference. Both of the above allow the shield to be flexible. Moreover, tuning the parameter λ allows a flexible trade-off between the two.

We allow a robust class of quantitative specifications using weighted automata, which have already been established as a robust specification framework. Any automata model can be used in the framework, not necessarily the ones we use here. For example, weighted automata that discount the future or process only finite words are suitable for planning purposes [32]. Thus our framework for quantitative shield synthesis is both robust and flexible.

#### **2.3 Examples**

In Remark 1 we already discussed the flexibility of the framework. We now present concrete examples of instantiations of the optimization problem above on our running example, which illustrate how quantitative shields can be used to cope with limitations of learned controllers.

**Dealing with Unexpected Plant Behavior; Rush-Hour Traffic.** Consider the abstraction described in Example 1, where each abstract state is a 4-dimensional vector that represents the number of waiting cars in each direction. The behavioral score we use is called the *max queue*. It charges an abstract state a ∈ {0,...,k}⁴ with the size of the maximal queue, no matter what the issued action is, thus cost_BEH(a) = max_{i∈{1,2,3,4}} a_i. A shield that minimizes the max-queue cost will prioritize the direction with the largest queue. For the interference score, we use a score that we call the *basic* interference score: we charge the shield 1 whenever it changes the controller's action and otherwise we charge it 0, and take the long-run average of the costs. Recall that in the construction in Example 1, we chose the vector of incoming cars uniformly at random. Here, in order to model rush-hour traffic, we use a different distribution, where we let p_j be the probability that a car enters the j-th queue. Then, the probability of a vector ⟨i1, i2, i3, i4⟩ ∈ {0, 1}⁴ is Π_{1≤j≤4} (p_j · i_j + (1 − p_j) · (1 − i_j)). To model a higher load traveling on the North-South route, we increase p1 and p3 beyond 0.5.
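The product formula for the incoming-car distribution can be tabulated directly; a minimal sketch, with the per-queue probability vector `p` as an assumed input.

```python
import itertools

def incoming_distribution(p):
    """Probability of each incoming-car vector <i1,...,i4> in {0,1}^4.

    `p[j]` is the probability that a car enters queue j; arrivals are
    independent Bernoulli, so Pr[i] = prod_j (p_j*i_j + (1-p_j)*(1-i_j)).
    """
    dist = {}
    for i in itertools.product((0, 1), repeat=4):
        pr = 1.0
        for pj, ij in zip(p, i):
            pr *= pj * ij + (1 - pj) * (1 - ij)
        dist[i] = pr
    return dist
```

With p = (0.5, 0.5, 0.5, 0.5) this recovers the uniform 1/16 distribution of Example 1; raising p1 and p3 shifts mass toward North-South arrivals.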

**Weighing Different Goals; Local Fairness.** Suppose the controller is trained to maximize the number of cars passing through a *city*. Thus, it aims to maximize the speed of the cars in the city and prioritizes highways over farm roads. A secondary objective for a controller is to minimize local queues. Rather than adding this objective in the training phase, which can have an unexpected outcome, we can add a local shield for each junction. To synthesize the shield, we use the same abstraction and basic interference score as in the above. The behavioral score we use charges an abstract state a ∈ {0,...,k}⁴ with the difference |(a1 + a3) − (a2 + a4)|; thus the greater the inequality between the two waiting directions, the higher the cost.

**Adding Features to the Controller; Prioritizing Public Transport.** Suppose a controller is trained to increase throughput in a junction. After the controller is trained, a designer wants to add a functionality to the controller that prioritizes buses over personal vehicles. That is, if a bus is waiting in the North direction, and no bus is waiting in either the East or West directions, then the light should be North-South, and the other cases are similar. The abstraction we use is simpler than the ones above since we only differentiate between a case in which a bus is present or not; thus the abstract states are {0, 1}⁴, where the indices represent the directions clockwise starting from North. For an issued action γ ≠ NS, the behavioral cost of a state a is 1 when a2 = a4 = 0 and a1 = 1 or a3 = 1. The interference score we use is the basic one. A shield guarantees that in the long run, the specification is essentially never violated.

#### **3 A Game-Theoretic Approach to Quantitative Shield Synthesis**

In order to synthesize optimal shields we construct a two-player stochastic game [10], where we associate Player 2 with the shield and Player 1 with the controller. The game is defined on top of an abstraction and the players' objectives are given by the two performance measures. We first formally define stochastic games, then we construct the shield synthesis game, and finally show how to extract a shield from a strategy for Player 2.

**Stochastic Graph Games.** The game is played on a graph by placing a token on a vertex and letting the players move it throughout the graph. For ease of presentation, we fix the order in which the players move: first Player 1, then Player 2, and then "Nature", i.e., the next vertex is chosen randomly. Edges have costs, which, again for convenience, appear only on edges following Player 2 moves. Formally, a two-player stochastic graph game is ⟨V1, V2, V_N, E, Pr, cost⟩, where V = V1 ∪ V2 ∪ V_N is a finite set of vertices that is partitioned into three sets: for i ∈ {1, 2}, Player i controls the vertices in V_i and "Nature" controls the vertices in V_N; E ⊆ (V1 × V2) ∪ (V2 × V_N) is a set of deterministic edges; Pr : V_N × V1 → [0, 1] is a probabilistic transition function; and cost : (V2 × V_N) → ℚ assigns costs to edges. Suppose the token reaches v ∈ V. If v ∈ V_i, for i ∈ {1, 2}, then Player i chooses the next position of the token u ∈ V such that E(v, u). If v ∈ V_N, then the next position is chosen randomly; namely, the token moves to u ∈ V with probability Pr[v, u].

The game is a *zero-sum game*: Player 1 tries to maximize the expected long-run average of the accumulated costs, and Player 2 tries to minimize it. A *strategy* for Player i, for i ∈ {1, 2}, is a function that takes a history in V∗ · V_i and returns the next vertex to move the token to. The games we consider admit *memoryless* optimal strategies, thus it suffices to define a Player i strategy as a function from V_i to V. We associate a *payoff* with two strategies f1 and f2, which we define next. Given f1 and f2, it is not hard to construct a Markov chain M with states V_N and with weights on the edges: for v, u ∈ V_N, the probability of moving from v to u in M is Pr^M[v, u] = Σ_{w∈V1 : f2(f1(w))=u} Pr[v, w], and the cost of the edge is cost^M(v, u) = Σ_{w∈V1 : f2(f1(w))=u} Pr[v, w] · cost(f1(w), u). The *stationary distribution* s_v of a vertex v ∈ V_N in M is a well-known concept [43]; intuitively, it measures the long-run average time that is spent in v. The payoff w.r.t. f1 and f2, denoted payoff(f1, f2), is Σ_{v,u∈V_N} s_v · Pr^M[v, u] · cost^M(v, u). The payoff of a strategy is the payoff it guarantees against any strategy of the other player, thus payoff(f1) = inf_{f2} payoff(f1, f2). A strategy f is *optimal* for Player 1 if it achieves the optimal payoff, thus f is optimal if payoff(f) = sup_{f1} payoff(f1). The definitions for Player 2 are dual.
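As a sketch of the payoff computation, the following approximates the stationary distribution of a small ergodic Markov chain by power iteration and then sums s_v · Pr^M[v, u] · cost^M(v, u). This is an illustrative computation on an explicit matrix, not the tool's implementation.

```python
def chain_payoff(prM, costM, n, iters=10000):
    """Long-run average cost of an n-state Markov chain.

    `prM` is an n x n row-stochastic matrix (list of lists) and `costM`
    the matching edge costs.  The stationary distribution is approximated
    by power iteration, which converges for ergodic chains.
    """
    s = [1.0 / n] * n
    for _ in range(iters):
        s = [sum(s[v] * prM[v][u] for v in range(n)) for u in range(n)]
    # payoff = sum over edges of s_v * Pr[v,u] * cost[v,u]
    return sum(s[v] * prM[v][u] * costM[v][u]
               for v in range(n) for u in range(n))
```

For instance, a symmetric two-state chain that pays 1 on every edge into state 1 has stationary distribution (1/2, 1/2) and payoff 1/2.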

**Constructing the Synthesis Game.** Consider an abstraction MDP A = ⟨Γ, A, a0, δ⟩, weighted automata for the behavioral score BEH = ⟨A × Γ, Q^BEH, q0^BEH, Δ^BEH, cost^BEH⟩ and interference score INT = ⟨Γ × Γ, Q^INT, q0^INT, Δ^INT, cost^INT⟩, and a factor λ ∈ [0, 1]. We associate Player 1 with the controller and Player 2 with the shield. In each step, the controller first chooses an action, then the shield chooses whether to alter it, and the next state is selected at random. Let S = A × Q^INT × Q^BEH. We define G_{A,BEH,INT,λ} = ⟨V1, V2, V_N, E, Pr, cost⟩, where


– V1 = S, V2 = S × Γ, and V_N = S × Γ × {N},
– E(s, ⟨s, γ⟩) for s ∈ S and γ ∈ Γ, and E(⟨s, γ⟩, ⟨s′, γ′, N⟩) for s = ⟨a, q1, q2⟩ ∈ S, γ, γ′ ∈ Γ, and s′ = ⟨a, q′1, q′2⟩ ∈ S s.t. Δ^INT(q1, ⟨γ, γ′⟩, q′1) and Δ^BEH(q2, ⟨a, γ′⟩, q′2),
– Pr[⟨a, q1, q2, γ, N⟩, ⟨a′, q1, q2⟩] = δ(a, γ)(a′), and
– for s = ⟨a, q1, q2⟩ and s′ = ⟨a, q′1, q′2⟩ as in the above, we have cost(⟨s, γ⟩, ⟨s′, γ′, N⟩) = λ · cost^INT(q1, ⟨γ, γ′⟩, q′1) + (1 − λ) · cost^BEH(q2, ⟨a, γ′⟩, q′2).

**From Strategies to Shields.** Recall that the game G_{A,BEH,INT,λ} admits memoryless optimal strategies. Consider an optimal memoryless strategy f for Player 2. Thus, given a Player 2 vertex in V2, the function f returns a vertex in V_N to move to. The shield SHIELD_f that is associated with f has the memory set M = Q^INT × Q^BEH, and the initial memory state is ⟨q0^INT, q0^BEH⟩. Given an abstract state a ∈ A, a memory state ⟨q^INT, q^BEH⟩ ∈ M, and a controller action γ ∈ Γ, let ⟨a, q′^INT, q′^BEH, γ′⟩ = f(⟨a, q^INT, q^BEH, γ⟩). The shield SHIELD_f returns the action γ′ and the updated memory state ⟨q′^INT, q′^BEH⟩.
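Extracting SHIELD_f from a memoryless strategy amounts to tabulating f over all (abstract state, memory, command) triples. A sketch under the assumption that f is given as a Python function on Player 2 vertices; all names here are illustrative.

```python
def shield_from_strategy(f, abstract_states, commands, memories):
    """Tabulate SHIELD_f from a memoryless Player-2 strategy f.

    `f` maps a Player-2 vertex (a, q_int, q_beh, gamma) to the Nature
    vertex (a, q_int2, q_beh2, gamma2) it moves to.  Memory states are
    pairs (q_int, q_beh).  Returns the shield as a lookup table keyed by
    (abstract state, memory, controller command).
    """
    table = {}
    for a in abstract_states:
        for m in memories:
            for gamma in commands:
                q_int, q_beh = m
                _, qi2, qb2, gamma2 = f((a, q_int, q_beh, gamma))
                table[(a, m, gamma)] = (gamma2, (qi2, qb2))
    return table
```

The table has exactly |A| · |M| · |Γ| entries, matching the shield-size bound discussed in Remark 2.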

**Theorem 1.** *Given an abstraction* A*, weighted automata* BEH *and* INT*, and a factor* λ*, the game* GA,BEH,INT,λ *admits optimal memoryless strategies. Let* f *be an optimal memoryless strategy for Player* <sup>2</sup>*. The shield* SHIELD<sup>f</sup> *is an optimal shield w.r.t.* <sup>A</sup>*,* BEH*,* INT*, and* λ*.*

*Remark 2* **(Shield size).** Recall that a shield is a function SHIELD : A × Γ × M → Γ × M, which we store as a table. The *size* of the shield is the size of the domain, namely the number of entries in the table. Given an abstraction with n1 states, a set of possible commands Γ, and weighted automata with n2 and n3 states, the size of the shield we construct is n1 · n2 · n3 · |Γ|.

*Remark 3.* Our construction of the game can be seen as a two-step procedure: we construct a stochastic game with two mean-payoff objectives, a.k.a. a *two-dimensional* game, where the shield player's goal is to minimize both the behavioral and interference scores separately. We then reduce the game to a "one-dimensional" game by weighing the scores with the parameter λ. We perform this reduction for several reasons. First, while multi-dimensional quantitative objectives have been studied in several cases, such as MDPs [4,6,7] and special problems of stochastic games (e.g., almost-sure winning) [2,5,8], no general algorithmic solution is known for stochastic games with two-dimensional objectives. Second, even for non-stochastic games with two-dimensional quantitative objectives, infinite memory is required in general [48]. Finally, in our setting, the parameter λ provides a meaningful trade-off: it can be associated with how highly we value the quality of the controller. If the controller is of poor quality, then we charge the shield less for interference and set λ to be low. On the other hand, for a high-quality controller, we charge the shield more for interference and set a high value for λ.

#### **4 Case Study**

We experiment with our framework in designing quantitative shields for traffic-light controllers that are trained using reinforcement-learning (RL). We illustrate the usefulness of shields in dealing with limitations of RL as well as providing an intuitive framework to complement RL techniques.

**Traffic Simulation.** All experiments were conducted using the traffic simulator "Simulation of Urban MObility" (SUMO, for short) [31], v0.22, via the SUMO Python API. Incoming traffic in the cities is chosen randomly. The simulations were executed on a desktop computer with a 4 × 2.70 GHz Intel Core i7-7500U CPU and 7.7 GB of RAM, running Ubuntu 16.04.

**The Traffic Light Controller.** We use RL to train a city-wide traffic-signal controller. Intuitively, the controller is aware of the waiting cars in each junction, and its actions constitute a light assignment to all the junctions. We train a controller using a deep convolutional Q-network [37]. In most of the networks we test with, there are two controlled junctions. The input vector to the neural network is a 16-dimensional vector, where 8 dimensions represent a junction. For each junction, the first four components state the number of cars approaching the junction and the last four components state the accumulated waiting time of the cars in each one of the lanes. For example, in Fig. 1, the first four components are (3, 6, 3, 1), thus the controller's state is not trimmed at 5. The controller is trained to minimize both the number of cars waiting in the queues and the total waiting time. For each junction i, the controller can choose to set the light to be either NS_i or EW_i, thus the set of possible actions is Γ = {NS1NS2, EW1NS2, NS1EW2, EW1EW2}.

We use a network consisting of 4 layers: the input layer is a convolutional layer with 16 nodes, and the first and second hidden layers consist of 604 nodes and 1166 nodes, respectively. The output layer consists of 4 neurons with linear activation functions, each representing one of the above-mentioned actions listed in Γ. The Q-learning uses the learning rate α = 0.001 and the discount factor 0.95 for the Q-update, and an ε-greedy exploration policy. The artificial neural network is built on an open source implementation<sup>1</sup> using Keras [9], and additional optimized functionality was provided by the NumPy [40] library. We train for 100 training epochs, where each epoch is 1500 seconds of simulated traffic, plus 2000 additional seconds in which no new cars are introduced. The total training time of the agent is roughly 1.5 hours. While the RL procedure that we use is a simple one, it is inspired by standard approaches

<sup>1</sup> https://github.com/Wert1996/Traffic-Optimisation.

to learning traffic controllers and produces controllers that perform relatively well also with no shield.

**The Shield.** We synthesize a "local" shield for a junction and copy the shield for each junction in the city. Recall that the first step in constructing the synthesis game is to construct an abstraction of the plant, which intuitively represents the information according to which the shield makes its decisions. The abstraction we use is described in Example 1; each state is a 4-dimensional vector of integers in {0,...,k}, which represents an abstraction of the number of waiting cars in each direction, cut off at k ∈ ℕ. As elaborated in the example, when a shield assigns a green light to a direction, we evict a car from each of the two respective queues and select the incoming cars uniformly at random. Regarding objectives, in most of our experiments, the behavioral score we use charges an abstract state a ∈ {0,...,k}⁴ with |(a1 + a3) − (a2 + a4)|, thus the shield aims to balance the total number of waiting cars per direction. The interference score we use charges the shield 1 for altering the controller's action.

Since we use simple automata for objectives, the size of the shields we use is |A × Γ|, where |Γ| = 2. In our experiments, we cut off the queues at k = 6, which results in a shield of size 2592. The synthesis procedure's running time is on the order of minutes. We have already pointed out that we are interested in small light-weight shields, and this is indeed what we construct. In terms of absolute size, the shield takes ∼60 KB versus the controller, which takes ∼3 MB; a difference of two orders of magnitude.

Our synthesis procedure includes a solution to a stochastic mean-payoff game. The complexity of solving such games is an interesting combinatorial problem in NP and coNP (thus unlikely to be NP-hard), for which the existence of a polynomial-time algorithm is a major long-standing open problem. The current best-known algorithms are exponential, and even for special cases like turn-based deterministic mean-payoff games or turn-based stochastic games with reachability objectives, no polynomial-time algorithms are known. The algorithm we implemented is called the *strategy iteration* algorithm [22,23], in which one starts with a strategy and iteratively improves it, where each iteration requires polynomial time. While the algorithm's worst-case complexity is exponential, in practice, the algorithm has been widely observed to terminate in a small number of iterations.

**Evaluating Performance.** Throughout all our experiments, we use a unified and concrete measure of performance: the total waiting time of the cars in the city. Our assumption is that minimizing this measure is the main objective of the designer of the traffic light system for the city. While performance is part of the objective function when training the controller, the other components of the objective are used in order to improve training. Similarly, the behavioral measure we use when synthesizing shields is chosen heuristically in order to construct shields that improve concrete performance.

**The Effect of Changing** *<sup>λ</sup>*. Recall that we use <sup>λ</sup> <sup>∈</sup> [0, 1] in order to weigh between the behavioral and interference measures of a shield, where the larger λ is, the more the shield is charged for interference. In our first set of experiments, we fix all parameters apart from λ and synthesize shields for a city that has two controllable junctions. In the first experiment, we use a random traffic flow that is similar to the one used in training.

**Fig. 2.** Results for shields constructed with various λ values, together with a fixed plant and controller, where the simulation traffic distribution matches the one the controller is trained for.

We depict the results of the simulation in Fig. 2. We make several observations on the results below.

*Interference.* We observe that the ratio of the time that the shield intervenes is low: for most values of λ, the ratio is well below 10%. For large values of λ, interference is too costly, and the shields become *trivial*, namely they never alter the actions of the controller. The performance we observe is thus the performance of the controller with no shield. In this set of experiments, we observe that the threshold after which shields become trivial is λ = 0.5; for different setups, the threshold changes.

*Performance.* We observe that performance, as a function of λ, follows a U-shaped curve. When λ is small, altering commands is cheap, the shield intervenes more frequently, and performance drops. This performance drop is expected: the shield is a simple device and the quality of its routing decisions cannot compete with the trained controller. This drop is also encouraging, since it illustrates that our experimental setting is interesting. Surprisingly, we observe that the curve dips below the no-shield baseline: for some values, e.g., λ = 0.4, the shield improves the performance of the controller. We find it unexpected that the shield improves performance even when observing trained behavior, and this performance increase is observed more significantly in the next experiments.

**Rush-Hour Traffic.** In Fig. 3, we use a shield to add robustness to a controller for behavior it was not trained for. We see a more significant performance gain in this exper-

**Fig. 3.** Similar to Fig. 2 only that the simulation traffic distribution models rush-hour traffic.

**Fig. 4.** Comparing the variability in performance of the different controllers, with shield (blue) and without a shield (red). (Color figure online)

iment. We use the controller from the previous experiment, which is trained for uniform car arrival. We simulate it in a network with "rush-hour" traffic, which we model by significantly increasing the traffic load in the North-South direction. We synthesize shields that prefer to evict traffic from the North-South queue over the East-West queue. We achieve this by altering the objective in the stochastic game; we charge the shield a greater penalty for cars waiting in these queues over the other queues. For most values of <sup>λ</sup> below 0.7, we see a performance gain. Note that the performance of the controller with no shield is depicted on the far right, where the shield is trivial. An alternative approach to synthesize a shield would be to alter the probabilities in the abstraction, but we found that altering the weights results in a better performance gain.

**Reducing Variability.** Machine-learning techniques are intricate and require expertise and fine tuning of parameters. This set of experiments shows how the use of shields reduces the variability of the controllers, and as a result reduces the importance of choosing optimal parameters in the training phase. We fix one of the shields from the first experiment, with λ = 0.4. We observe performance in a city with various controllers, trained with varying training parameters, when the controllers are run with and without the shield under various traffic conditions that sometimes differ from the ones they were trained on.

The city we experiment with consists of a main two-lane road that crosses the city from East to West. The main road has two junctions in which smaller "farm roads" meet the main road. We refer to the *bulk traffic* as the traffic that only "crosses the city"; namely, it flows only on the main road either from East to West or in the opposite direction. For <sup>r</sup> <sup>∈</sup> [0, 1], Controller-<sup>r</sup> is trained where the ratio of the bulk traffic out of the total traffic is r. That is, the higher r is, the less traffic travels on the farm roads. We run simulations in which Controller-<sup>r</sup> observes bulk traffic <sup>k</sup> <sup>∈</sup> [0, 1], which it was not necessarily trained for.

**Fig. 5.** Results for Controllers-0.65 and 0.9 experiencing traffic that they were not trained for, with and without a shield. Performance is the total waiting time of the cars in the city.

In Fig. 5, we depict the performance of two controllers for various traffic settings. We observe, in these two controllers as well as the others, that operating with a shield consistently improves performance. The plots also illustrate the unexpected behavior of machine-learning techniques: e.g., when run without a shield, Controller-0.9 outperforms Controller-0.65 in all settings, even in the setting 0.65 on which Controller-0.65 was trained. Thus, a designer who expects a traffic flow of 0.65 would be better off training with a traffic of 0.9. A shield improves performance and thus reduces the importance of which training data to use.

*Measuring Variability.* In Fig. 4, we depict the variability in performance between the controllers. The higher the variability is, the more significant it is to choose the right parameters when training the controller. Formally, let $R = \{0.65, 0.7, 0.75, 0.8, 0.85, 0.9\}$. For $r, k \in R$, we let $\mathrm{Perf}(r, k)$ denote the performance (total waiting times) when Controller-$r$ observes bulk traffic $k$. For each $k \in R$, we depict $\max_{r \in R} \mathrm{Perf}(r, k) - \min_{r' \in R} \mathrm{Perf}(r', k)$, when operating with and without a shield.
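The variability metric above is simple to compute; the following sketch illustrates it on purely hypothetical waiting-time numbers (the `perf_*` tables are invented for illustration, not taken from the paper's measurements):

```python
# R and the Perf table mirror the definition above; all waiting-time
# numbers here are hypothetical, purely to illustrate the computation.
R = [0.65, 0.7, 0.75, 0.8, 0.85, 0.9]

def variability(perf, k):
    """max_{r in R} Perf(r, k) - min_{r' in R} Perf(r', k)."""
    vals = [perf[(r, k)] for r in R]
    return max(vals) - min(vals)

# Illustrative waiting times for one observed traffic setting k = 0.8:
perf_no_shield = {(r, 0.8): w for r, w in zip(R, [310, 290, 260, 240, 255, 225])}
perf_shield = {(r, 0.8): w for r, w in zip(R, [215, 210, 205, 208, 212, 204])}

print(variability(perf_no_shield, 0.8))  # 85
print(variability(perf_shield, 0.8))     # 11
```

A low value for the shielded table means that the choice of training parameter $r$ mattered little.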

Clearly, the variability with a shield is significantly lower than without one. This data shows that, when operating with a shield, it does not make much difference whether a designer trains a controller with setting $r$ or $r'$. When operating without a shield, the difference is significant.

**Overcoming Liveness Bugs.** Finding bugs in learned controllers is a challenging task. Shields bypass the need to find bugs since they treat the controller as a black box and correct its behavior. We illustrate their usefulness in dealing with liveness bugs. In the same network as in the previous setting, we experiment with a controller whose training process lacked variability. In Fig. 6, we depict the light configuration on the main road throughout the experiment; the horizontal axis represents time, red means a red light for the main road, and dually for green. Initially, the controller performs well, but roughly half-way through the simulation it hits a bad state, after which the light stays red. The shield, with only a few interferences, which are represented with dots, manages to recover the controller from its stuck state. In Fig. 7, we depict the number of waiting cars in the city, which clearly skyrockets once the controller gets stuck. It is evident that initially, the controller performs well. This point highlights that it is difficult to recognize when a controller has a bug: in order to catch such a bug, a designer would need to find the right simulation and run it half way through before the bug appears.

**Fig. 6.** The light in the East-West direction (the main road) of a junction. On the bottom, with no shield, the controller is stuck. On the top, the shield's interferences are marked with dots.

**Fig. 7.** The total number of waiting cars (log-scale) with and without a shield. Initially, the controller performs well on its own, until it gets stuck and traffic in the city freezes.

One way to regain liveness would be to synthesize a shield for the qualitative property "each direction eventually gets a green light". Instead, we use a shield that is synthesized for the quantitative specification as in the previous experiment. The shield, with a total of only 20 alterations, is able to recover the controller from the bad state it is stuck in, and traffic flows correctly.

**Adding Functionality: Prioritizing Public Transport.** Learned controllers are monolithic. Adding functionality to a controller requires complete re-training, which is time consuming, computationally costly, and requires care; changes in the objective can cause unexpected side effects on performance. We illustrate how, using a shield, we can add to an existing controller the functionality of prioritizing public transport.

The abstraction over which the shield is constructed differs slightly from the one used in the other experiments. The abstract state space is the same, namely four-dimensional vectors, though we interpret the entries as the positions of a bus in the respective queue. For example, the state (0, 3, 0, 1) represents no bus in the North queue and a bus which is waiting, third in line, in the East queue. Outgoing edges from an abstract state also differ, as they take into account, probabilistically, that vehicles might enter the queues between buses. For the behavioral score, we charge an abstract state with the sum of its entries; thus the shield is charged whenever buses are waiting, and it aims to evict them from the queues as soon as possible.

In Fig. 8, we depict the performance of all vehicles and of buses alone as a function of the weighing factor λ. The result of this experiment is positive; the predicted behavior is observed. Indeed, when λ is small, interferences are cheap, which increases bus performance at the expense of the general performance. The experiment illustrates that the parameter λ is a convenient means to control the degree of prioritization of buses.

**Local Fairness.** In this experiment, we add local fairness to a controller that was trained for a global objective. We experiment with a network with four junctions and a city-wide controller, which aims to minimize total waiting times. Figure 9 shows that when the controller is deployed on its own, queues form in the city, whereas a shield, which was synthesized as in the first experiments, prevents such local queues from forming.

**Fig. 8.** The waiting time of buses/all vehicles with shields parameterized by λ.

**Fig. 9.** Comparing the amount of waiting cars with and without a shield.

#### **5 Discussion and Future Work**

We suggest a framework for automatically synthesizing quantitative runtime shields to cope with limitations of machine-learning techniques. We show how shields can increase robustness to untrained behavior, deal with liveness bugs without verification, add features without retraining, and decrease variability of performance due to changes in the training parameters, which is especially helpful for machine learning non-experts. We use weighted automata to evaluate controller and shield behavior and construct a game whose solution is an optimal shield w.r.t. a weighted specification and a plant abstraction. The framework is robust and can be applied in any setting where learned or other black-box controllers are used.

We list several directions for further research. In this work, we make no assumptions on the controller and treat it adversarially. Since the controller might have bugs, modelling it as adversarial is reasonable. It is, however, also a crude abstraction since, typically, the objectives of the controller and the shield are similar. For future work, we plan to study ways to model the spectrum between cooperative and adversarial controllers, together with solution concepts for the games that they give rise to.

In this work, we make no assumptions on the relationship between the plant and the abstraction. While the constructed shields are optimal w.r.t. the given abstraction, the scores they guarantee w.r.t. the abstraction do not imply performance guarantees on the plant. To be able to produce performance guarantees on the concrete plant, we need guarantees on the relationship between the plant and its abstraction. For future work, we plan to study the addition of such guarantees and how they affect the quality measures.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Taming Delays in Dynamical Systems: Unbounded Verification of Delay Differential Equations**

Shenghua Feng<sup>1,2</sup>, Mingshuai Chen<sup>1,2(B)</sup>, Naijun Zhan<sup>1,2(B)</sup>, Martin Fränzle<sup>3</sup>, and Bai Xue<sup>1,2</sup>

> <sup>1</sup> SKLCS, Institute of Software, CAS, Beijing, China {fengsh,chenms,znj,xuebai}@ios.ac.cn <sup>2</sup> University of Chinese Academy of Sciences, Beijing, China <sup>3</sup> Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany fraenzle@informatik.uni-oldenburg.de

**Abstract.** Delayed coupling between state variables occurs regularly in technical dynamical systems, especially embedded control. As it consequently is omnipresent in safety-critical domains, there is an increasing interest in the safety verification of systems modelled by Delay Differential Equations (DDEs). In this paper, we leverage qualitative guarantees for the existence of an exponentially decreasing estimation on the solutions to DDEs as established in classical stability theory, and present a quantitative method for constructing such delay-dependent estimations, thereby facilitating a reduction of the verification problem over an unbounded temporal horizon to a bounded one. Our technique builds on the linearization technique for nonlinear dynamics and spectral analysis of the linearized counterparts. We show experimentally on a set of representative benchmarks from the literature that our technique indeed extends the scope of bounded verification techniques to unbounded verification tasks. Moreover, our technique is easy to implement and can be combined with any automatic tool dedicated to the bounded verification of DDEs.

**Keywords:** Unbounded verification · Delay Differential Equations (DDEs) · Safety and stability · Linearization · Spectral analysis

#### **1 Introduction**

The theory of dynamical systems featuring delayed coupling between state variables dates back to the 1920s, when Volterra [41,42], in his research on predator-prey models and viscoelasticity, formulated some rather general differential equations incorporating the past states of the system. This formulation, now known as delay differential equations (DDEs), was developed further by, e.g., Myshkis [30] and Bellman

This work has been supported through grants by NSFC under grant No. 61625206, 61732001 and 61872341, by Deutsche Forschungsgemeinschaft through grants No. GRK 1765 and FR 2715/4, and by the CAS Pioneer Hundred Talents Program under grant No. Y8YC235015.

and Cooke [2], and has witnessed numerous applications in many domains. Prominent examples include population dynamics [25], where birth rate follows changes in population size with a delay related to reproductive age; spreading of infectious diseases [5], where delay is induced by the incubation period; or networked control systems [21] with their associated transport delays when forwarding data through the communication network. These applications range further to models in optics [23], economics [38], and ecology [13], to name just a few. Albeit resulting in more accurate models, the presence of time delays in feedback dynamics often induces considerable extra complexity when one attempts to design or even verify such dynamical systems. This stems from the fact that the presence of feedback delays reduces controllability due to the impossibility of immediate reaction and enhances the likelihood of transient overshoot or even oscillation in the feedback system, thus violating safety or stability certificates obtained on idealized, delay-free models of systems prone to delayed coupling.

Though established automated methods addressing ordinary differential equations (ODEs) and their derived models, like hybrid automata, have been extensively studied in the verification literature, techniques pertaining to ODEs do not generalize straightforwardly to delayed dynamical systems described by DDEs. The reason is that the future evolution of a DDE is no longer governed by the current state instant only, but depends on a chunk of its historical trajectory, such that introducing even a single constant delay immediately turns a system with finite-dimensional states into an infinite-dimensional dynamical system. There are approximation methods, say the Padé approximation [39], that approximate DDEs with finite-dimensional models, which however may hide fundamental behaviors, e.g., (in-)stability, of the original delayed dynamics, as remarked in Sect. 5.2.2.8.1 of [26]. Consequently, despite well-developed numerical methods for solving DDEs as well as methods for stability analysis in the realm of control theory, hitherto in automatic verification only a few approaches address the effects of delays, due to the immediate impact of delays on the structure of the state spaces to be traversed by state-exploratory methods.

In this paper, we present a constructive approach dedicated to verifying safety properties of delayed dynamical systems encoded by DDEs, where the safety properties pertain to an infinite time domain. This problem is of particular interest when one pursues correctness guarantees concerning the dynamics of safety-critical systems over the long run. Our approach builds on the *linearization* technique for potentially nonlinear dynamics and *spectral analysis* of the linearized counterparts. We leverage qualitative guarantees for the existence of an exponentially decreasing estimation on the solutions to DDEs as established in classical stability theory (see, e.g., [2,19,24]), and present a quantitative method to construct such estimations, thereby reducing the temporally unbounded verification problems to their bounded counterparts.

The class of systems we consider features delayed differential dynamics governed by DDEs of the form $\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{x}(t - r_1), \dots, \mathbf{x}(t - r_k))$ with initial states specified by a continuous function $\boldsymbol{\phi}(t)$ on $[-r_{\max}, 0]$, where $r_{\max} = \max\{r_1, \dots, r_k\}$. It thus involves a combination of ODE and DDE with multiple constant delays $r_i > 0$, and has been successfully used to model various real-world systems in the aforementioned fields. In general, formal verification of unbounded safety or, dually, reachability properties of such systems inherits undecidability from similar properties for ODEs (cf. e.g., [14]). We therefore tackle this unbounded verification problem by leveraging a stability criterion of the system under investigation.

*Contributions.* In this paper, we present a quantitative method for constructing a delay-dependent, exponentially decreasing upper bound, if existent, that encloses trajectories of a DDE originating from a certain set of initial functions. This method consequently yields a temporal bound $T^*$ such that for any $T > T^*$, the system is safe over $[-r_{\max}, T]$ iff it is safe over $[-r_{\max}, \infty)$. For linear dynamics, such an equivalence of safety applies to any initial set of functions drawn from a compact subspace of $\mathbb{R}^n$; while for nonlinear dynamics, our approach produces (a subset of) the *basin of attraction* around a *steady state*, and therefore a certificate (by bounded verification in finitely many steps) that guarantees the reachable set being contained in this basin suffices to claim safety/unsafety of the system over an infinite time horizon. Our technique is easy to implement and can be combined with any automatic tool for bounded verification of DDEs. We show experimentally on a set of representative benchmarks from the literature that our technique effectively extends the scope of bounded verification techniques to unbounded verification tasks.

*Related Work.* As surveyed in [14], the research community has over the past three decades vividly addressed automatic verification of hybrid discrete-continuous systems in a safety-critical context. The almost universal undecidability of the unbounded reachability problem, however, confines the sound key-press routines to either semi-decision procedures or even approximation schemes, most of which address bounded verification by computing the finite-time image of a set of initial states. It should be obvious that the functional rather than state-based nature of the initial condition of DDEs prevents a straightforward generalization of this approach.

Prompted by actual engineering problems, interest in the safety verification of continuous or hybrid systems featuring delayed coupling has increased recently. We classify these contributions into two tracks. The first track pursues propagation-based bounded verification: Huang et al. presented in [21] a technique for simulation-based time-bounded invariant verification of nonlinear networked dynamical systems with delayed interconnections, by computing bounds on the sensitivity of trajectories to changes in initial states and inputs of the system. A method adopting the paradigm of verification-by-simulation (see, e.g., [9,16,31]) was proposed in [4], which integrates rigorous error analysis of the numeric solving and the sensitivity-related state-bloating algorithms (cf. [7]) to obtain safe enclosures of time-bounded reachable sets for systems modelled by DDEs. In [46], the authors identified a class of DDEs featuring a local homeomorphism property, which facilitates construction of over- and under-approximations of reachable sets by performing reachability analysis on the boundaries of the initial sets. Goubault et al. presented in [17] a scheme to compute inner- and outer-approximating flowpipes for DDEs with uncertain initial states and parameters using Taylor models combined with space abstraction in the shape of zonotopes. The other track of the literature tackles the unbounded reachability problem of DDEs by taking into account the asymptotic behavior of the dynamics under investigation, captured by, e.g., Lyapunov functions in [32,47] and barrier certificates in [35]. These approaches however share a common limitation: a polynomial template has to be specified, either for the interval Taylor models exploited in [47] (and its extension [29] to cater for properties specified as bounded metric interval temporal logic (MITL) formulae), for the Lyapunov functionals in [32], or for the barrier certificates in [35].
Our approach drops this limitation by resorting to the linearization technique followed by spectral analysis of the linearized counterparts, and furthermore extends over [47] by allowing immediate feedback (i.e., $\mathbf{x}(t)$) as well as multiple delays in the dynamics, to which their technique does not generalize immediately. In contrast to the *absolute stability* exploited in [32], namely a criterion that ensures stability for arbitrarily large delays, we give the construction of a delay-dependent stability certificate, thereby substantially increasing the scope of dynamics amenable to stability criteria, for instance, the famous Wright's equation (cf. [44]). Finally, we refer the readers to [34] and [33] for related contributions showing the existence of abstract symbolic models for nonlinear control systems with time-varying and unknown time-delay signals via approximate bisimulations.

#### **2 Problem Formulation**

**Notations.** Let $\mathbb{N}$, $\mathbb{R}$ and $\mathbb{C}$ be the set of natural, real and complex numbers, respectively. Vectors will be denoted by boldface letters. For $z = a + ib \in \mathbb{C}$ with $a, b \in \mathbb{R}$, the real and imaginary parts of $z$ are denoted respectively by $\Re(z) = a$ and $\Im(z) = b$; $|z| = \sqrt{a^2 + b^2}$ is the modulus of $z$. For a vector $\mathbf{x} \in \mathbb{R}^n$, $x_i$ refers to its $i$-th component, and its maximum norm is denoted by $\|\mathbf{x}\| = \max_{1 \le i \le n} |x_i|$. We define, for $\delta > 0$, $\mathcal{B}(\mathbf{x}, \delta) = \{\mathbf{x}' \in \mathbb{R}^n \mid \|\mathbf{x}' - \mathbf{x}\| \le \delta\}$ as the $\delta$-closed ball around $\mathbf{x}$. The notation $\|\cdot\|$ extends to a set $X \subseteq \mathbb{R}^n$ as $\|X\| = \sup_{\mathbf{x} \in X} \|\mathbf{x}\|$, and to an $m \times n$ complex-valued matrix $A$ as $\|A\| = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|$. $\overline{X}$ is the closure of $X$ and $\partial X$ denotes the boundary of $X$. For $a \le b$, let $\mathcal{C}^0([a, b], \mathbb{R}^n)$ denote the space of continuous functions from $[a, b]$ to $\mathbb{R}^n$, which is associated with the maximum norm $\|f\| = \max_{t \in [a, b]} \|f(t)\|$. We abbreviate $\mathcal{C}^0([-r, 0], \mathbb{R}^n)$ as $\mathcal{C}_r$ for a fixed positive constant $r$, and let $\mathcal{C}^1$ consist of all continuously differentiable functions. Given a measurable function $f \colon [0, \infty) \to \mathbb{R}$ such that $\|f(t)\| \le a e^{bt}$ for some constants $a$ and $b$, the Laplace transform $\mathcal{L}\{f\}$, defined by $\mathcal{L}\{f\}(z) = \int_0^\infty e^{-zt} f(t)\,\mathrm{d}t$, exists and is an analytic function of $z$ for $\Re(z) > b$.
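As a sanity check on the definition, the transform of a simple exponential can be evaluated numerically; a minimal sketch assuming scipy, where truncating the tail of the integral is merely an implementation shortcut:

```python
import numpy as np
from scipy.integrate import quad

def laplace_real(f, z, upper=60.0):
    """Numerically evaluate L{f}(z) = int_0^inf e^{-z t} f(t) dt for real z,
    truncating the tail (negligible here: the integrand decays like
    e^{-(z - b) t})."""
    val, _ = quad(lambda t: np.exp(-z * t) * f(t), 0.0, upper)
    return val

# f(t) = e^{0.5 t} grows like e^{b t} with b = 0.5, so L{f}(z) exists for
# Re(z) > 0.5 and equals 1 / (z - 0.5); check at the real point z = 2:
approx = laplace_real(lambda t: np.exp(0.5 * t), 2.0)
```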

**Delayed Differential Dynamics.** We consider a class of dynamical systems featuring delayed differential dynamics governed by DDEs of autonomous type:

$$\begin{cases} \dot{\mathbf{x}}(t) = \mathbf{f}\left(\mathbf{x}(t), \mathbf{x}(t - r_1), \dots, \mathbf{x}(t - r_k)\right), & t \in [0, \infty) \\ \mathbf{x}(t) = \boldsymbol{\phi}(t), & t \in [-r_k, 0] \end{cases} \tag{1}$$

where $\mathbf{x}$ is the time-dependent *state* vector in $\mathbb{R}^n$, $\dot{\mathbf{x}}$ denotes its temporal derivative $\mathrm{d}\mathbf{x}/\mathrm{d}t$, and $t$ is a real variable modelling time. The discrete delays are assumed to be ordered as $r_k > \dots > r_1 > 0$, and the initial states are specified by a vector-valued function $\boldsymbol{\phi} \in \mathcal{C}_{r_k}$.

Suppose $\mathbf{f}$ is a Lipschitz-continuous vector-valued function in $\mathcal{C}^1\left(\mathbb{R}^{(k+1)n}, \mathbb{R}^n\right)$, which implies that the system has a unique maximal *solution* (or *trajectory*) from a given initial condition $\boldsymbol{\phi} \in \mathcal{C}_{r_k}$, denoted as $\boldsymbol{\xi}_{\boldsymbol{\phi}} \colon [-r_k, \infty) \to \mathbb{R}^n$. We denote in the sequel by $\mathbf{f}_{\mathbf{x}} = \left(\frac{\partial \mathbf{f}}{\partial x_1} \cdots \frac{\partial \mathbf{f}}{\partial x_n}\right)$ the Jacobian matrix (i.e., the matrix consisting of all first-order partial derivatives) of $\mathbf{f}$ w.r.t. the component $\mathbf{x}(t)$. Similar notations apply to the components $\mathbf{x}(t - r_i)$, for $i = 1, \dots, k$.
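For intuition, a solution of a scalar instance of Eq. (1) can be traced numerically by the classical method of steps: on each interval of length $r$, the delayed argument refers to an already-known segment of the trajectory, so the DDE reduces to an ODE. The sketch below is an illustration using a plain explicit Euler scheme (not the paper's tooling), integrating $\dot{x}(t) = -x(t - 1)$ with constant history:

```python
import numpy as np

def solve_dde_steps(f, phi, r, T, dt=1e-3):
    """Integrate x'(t) = f(x(t), x(t - r)) on [0, T]. On each interval
    [m*r, (m+1)*r] the delayed argument x(t - r) refers to an already
    computed (or initial) segment, so the DDE reduces to an ODE, here
    advanced with explicit Euler on a grid aligned with the delay."""
    n = int(round(r / dt))              # grid points per delay interval
    ts = np.arange(-n, 1) * dt          # history grid on [-r, 0]
    xs = [float(phi(t)) for t in ts]
    for _ in range(int(round(T / dt))):
        x_now, x_delayed = xs[-1], xs[-1 - n]
        xs.append(x_now + dt * f(x_now, x_delayed))
    return np.array(xs[n:])             # samples of x on [0, T]

# Wright-type example: x'(t) = -x(t - 1) with constant history phi = 1;
# the solution oscillates and decays towards the zero equilibrium.
traj = solve_dde_steps(lambda x, xd: -xd, lambda t: 1.0, r=1.0, T=10.0)
```

With this history, the exact solution on $[0, 1]$ is $x(t) = 1 - t$, which the grid reproduces up to rounding; after several delay intervals the amplitude has visibly decayed.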

*Example 1 (Gene regulation* [12,36]*).* The control of gene expression in cells is often modelled with time delays in equations of the form

$$\begin{cases} \dot{x}_1(t) = g\left(x_n(t - r_n)\right) - \beta_1 x_1(t) \\ \dot{x}_j(t) = x_{j-1}(t - r_{j-1}) - \beta_j x_j(t), & 1 < j \le n \end{cases} \tag{2}$$

where the gene is transcribed producing mRNA ($x_1$), which is translated into enzyme $x_2$, which in turn produces another enzyme $x_3$, and so on. The end product $x_n$ acts to repress the transcription of the gene via $g$ with $\dot{g} < 0$. Time delays are introduced to account for the time involved in transcription, translation, and transport. The positive $\beta_j$'s represent decay rates of the species. The dynamics described in Eq. (2) fall exactly into the scope of systems considered in this paper, and in fact, they instantiate a more general family of systems known as monotone cyclic feedback systems (MCFS) [28], which includes neural networks, testosterone control, and many other effects in systems biology.

**Lyapunov Stability.** Given a system of DDEs in Eq. (1), suppose $\mathbf{f}$ has a steady state (a.k.a. *equilibrium*) at $\mathbf{x}_e$ such that $\mathbf{f}(\mathbf{x}_e, \dots, \mathbf{x}_e) = \mathbf{0}$. Then, in the standard Lyapunov terminology, the equilibrium $\mathbf{x}_e$ is *stable* if for every $\epsilon > 0$ there exists $\delta > 0$ such that $\|\boldsymbol{\phi} - \mathbf{x}_e\| \le \delta$ implies $\|\boldsymbol{\xi}_{\boldsymbol{\phi}}(t) - \mathbf{x}_e\| \le \epsilon$ for all $t \ge 0$; it is *asymptotically stable* if it is stable and moreover $\boldsymbol{\xi}_{\boldsymbol{\phi}}(t) \to \mathbf{x}_e$ as $t \to \infty$ for all $\boldsymbol{\phi}$ sufficiently close to $\mathbf{x}_e$; and it is *exponentially stable* if there exist $\delta, K, \alpha > 0$ such that $\|\boldsymbol{\phi} - \mathbf{x}_e\| \le \delta$ implies $\|\boldsymbol{\xi}_{\boldsymbol{\phi}}(t) - \mathbf{x}_e\| \le K\|\boldsymbol{\phi} - \mathbf{x}_e\| e^{-\alpha t}$ for all $t \ge 0$.

Here $\mathbf{x}_e$ can be generalized to a constant function in $\mathcal{C}_{r_k}$ when employing the supremum norm $\|\boldsymbol{\phi} - \mathbf{x}_e\|$ over functions. This norm further yields the *locality* of the above definitions, i.e., they describe the behavior of a system near an equilibrium, rather than for all initial conditions $\boldsymbol{\phi} \in \mathcal{C}_{r_k}$, in which case one speaks of *global stability*. W.l.o.g., we assume $\mathbf{f}(\mathbf{0}, \dots, \mathbf{0}) = \mathbf{0}$ in the sequel and investigate the stability of the zero equilibrium thereof. Any nonzero equilibrium can be straightforwardly shifted to a zero one by a coordinate transformation while preserving the stability properties; see, e.g., [19].

**Safety Verification Problem.** Given a compact set $\mathcal{X} \subseteq \mathbb{R}^n$ of initial states and a set $\mathcal{U} \subseteq \mathbb{R}^n$ of unsafe or otherwise bad states, a delayed dynamical system of the form (1) is said to be $T$-*safe* iff all trajectories originating from any $\boldsymbol{\phi}(t)$ satisfying $\boldsymbol{\phi}(t) \in \mathcal{X}$ for all $t \in [-r_k, 0]$ do not intersect $\mathcal{U}$ at any $t \in [-r_k, T]$, and $T$-*unsafe* otherwise. In particular, we distinguish *unbounded verification* with $T = \infty$ from *bounded verification* with $T < \infty$.

In subsequent sections, we first present our approach to tackling the safety verification problem of delayed differential dynamics coupled with one single constant delay (i.e., k = 1 in Eq. (1)) in an unbounded time domain, by leveraging a quantitative stability criterion, if existent, for the linearized counterpart of the potentially nonlinear dynamics in question. A natural extension of this approach to cater for dynamics with multiple delay terms will be remarked thereafter. In what follows, we start the elaboration of the method from DDEs of linear dynamics that admit spectral analysis, and move to nonlinear cases afterwards and show how the linearization technique can be exploited therein.

#### **3 Linear Dynamics**

Consider the linear sub-class of dynamics given in Eq. (1):

$$\begin{cases} \dot{\mathbf{x}}(t) = A\mathbf{x}(t) + B\mathbf{x}(t - r), & t \in [0, \infty) \\ \mathbf{x}(t) = \boldsymbol{\phi}(t), & t \in [-r, 0] \end{cases} \tag{3}$$

where $A, B \in \mathbb{R}^{n \times n}$, $\boldsymbol{\phi} \in \mathcal{C}_r$, and the system is associated with the *characteristic equation*

$$\det\left(zI - A - Be^{-rz}\right) = 0, \tag{4}$$


where $I$ is the $n \times n$ identity matrix. Denote by $h(z) = zI - A - Be^{-rz}$ the *characteristic matrix* in the sequel. Notice that the characteristic equation can be obtained by seeking nontrivial solutions to Eq. (3) of the form $\boldsymbol{\xi}_{\boldsymbol{\phi}}(t) = \mathbf{c}e^{zt}$, where $\mathbf{c}$ is an $n$-dimensional nonzero constant vector.

The roots $\lambda \in \mathbb{C}$ of Eq. (4) are called *characteristic roots* or *eigenvalues*, and the set of all eigenvalues is referred to as the *spectrum*, denoted by $\sigma = \{\lambda \mid \det(h(\lambda)) = 0\}$. Due to the exponentiation in the characteristic equation, the DDE has, in line with its infinite-dimensional nature, possibly infinitely many eigenvalues, making a spectral analysis more involved. The spectrum does, however, enjoy some elementary properties that can be exploited in the analysis. For instance, the spectrum has no finite accumulation point in $\mathbb{C}$, and therefore for each positive $\gamma \in \mathbb{R}$, the number of roots satisfying $|\lambda| \le \gamma$ is finite. It follows that the spectrum is a countable (albeit possibly infinite) set:

**Lemma 1 (Accumulation freedom** [6,19]**).** *Given* $\gamma \in \mathbb{R}$*, there are at most finitely many characteristic roots satisfying* $\Re(\lambda) > \gamma$*. If there is a sequence* $\{\lambda_n\}$ *of roots of Eq.* (4) *such that* $|\lambda_n| \to \infty$ *as* $n \to \infty$*, then* $\Re(\lambda_n) \to -\infty$ *as* $n \to \infty$*.*

Lemma 1 suggests that there are only finitely many roots in any vertical strip of the complex plane, and there thus exists an upper bound $\alpha \in \mathbb{R}$ such that every characteristic root $\lambda$ in the spectrum satisfies $\Re(\lambda) < \alpha$. This upper bound essentially captures the asymptotic behavior of the linear dynamics:

**Theorem 1 (Globally exponential stability** [6,36]**).** *Suppose* $\Re(\lambda) < \alpha$ *for every characteristic root* $\lambda$*. Then there exists* $K > 0$ *such that*

$$\|\boldsymbol{\xi}_{\boldsymbol{\phi}}(t)\| \le K\|\boldsymbol{\phi}\|e^{\alpha t}, \quad \forall t \ge 0,\ \forall \boldsymbol{\phi} \in \mathcal{C}_r, \tag{5}$$

*where* $\boldsymbol{\xi}_{\boldsymbol{\phi}}(t)$ *is the solution to Eq.* (3)*. In particular,* $\mathbf{x} = \mathbf{0}$ *is a globally exponentially stable equilibrium of Eq.* (3) *if* $\Re(\lambda) < 0$ *for every characteristic root; it is unstable if there is a root satisfying* $\Re(\lambda) > 0$*.*

Theorem 1 establishes an existential guarantee that the solution to the linear delayed dynamics approaches the zero equilibrium exponentially for any initial condition in $\mathcal{C}_r$. To achieve automatic safety verification, however, we ought to find a constructive means of estimating the (signed) rate of convergence $\alpha$ and the coefficient $K$ in Eq. (5). This motivates the introduction of the so-called *fundamental solution* $\boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)$ to Eq. (3), whose Laplace transform will later be shown to be $h^{-1}(z)$, the inverse characteristic matrix, which always exists for $z$ satisfying $\Re(z) > \max_{\lambda \in \sigma} \Re(\lambda)$.

**Lemma 2 (Variation-of-constants** [19,36]**).** *Let* $\boldsymbol{\xi}_{\boldsymbol{\phi}}(t)$ *be the solution to Eq.* (3)*. Denote by* $\boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)$ *the solution that satisfies Eq.* (3) *for* $t \ge 0$ *and satisfies a variation of the initial condition, namely* $\boldsymbol{\phi}'(0) = I$ *and* $\boldsymbol{\phi}'(t) = O$ *for all* $t \in [-r, 0)$*, where* $O$ *is the* $n \times n$ *zero matrix; then for* $t \ge 0$*,*

$$\boldsymbol{\xi}_{\boldsymbol{\phi}}(t) = \boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)\boldsymbol{\phi}(0) + \int_0^t \boldsymbol{\xi}_{\boldsymbol{\phi}'}(t - \tau)B\boldsymbol{\phi}(\tau - r)\,\mathrm{d}\tau. \tag{6}$$

Note that in Eq. (6), $\boldsymbol{\phi}(t)$ is extended to $[-r, \infty)$ by making it zero for $t > 0$. In spite of the discontinuity of $\boldsymbol{\phi}'$ at zero, the existence of the solution $\boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)$ can be proven by the well-known method of steps [8].

**Lemma 3 (Fundamental solution** [19]**).** *The solution* $\boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)$ *to Eq.* (3) *with initial data* $\boldsymbol{\phi}'$ *is the fundamental solution; that is, for* $z$ *s.t.* $\Re(z) > \max_{\lambda \in \sigma} \Re(\lambda)$*,*

$$\mathcal{L}\{\boldsymbol{\xi}_{\boldsymbol{\phi}'}\}(z) = h^{-1}(z).$$

The fundamental solution $\boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)$ can be proven to share the same exponential bound as that in Theorem 1, while the following theorem, as a consequence of Lemma 2, gives an exponential estimation of $\boldsymbol{\xi}_{\boldsymbol{\phi}}(t)$ in connection with $\boldsymbol{\xi}_{\boldsymbol{\phi}'}(t)$:

**Theorem 2 (Exponential estimation** [36]**).** *Denote by* $\mu = \max_{\lambda \in \sigma} \Re(\lambda)$ *the maximum real part of the eigenvalues in the spectrum. Then for any* $\alpha > \mu$*, there exists* $K > 0$ *such that*

$$\|\xi\_{\phi'}(t)\| \le K e^{\alpha t}, \quad \forall t \ge 0,\tag{7}$$

*and hence by Eq.* (6)*,* $\|\xi_{\phi}(t)\| \le K \left(1 + \|B\| \int_0^r \mathrm{e}^{-\alpha\tau}\,\mathrm{d}\tau\right) \|\phi\|\, \mathrm{e}^{\alpha t}$ *for any* $t \ge 0$ *and* $\phi \in \mathcal{C}_r$*. In particular,* $\mathbf{x} = \mathbf{0}$ *is globally exponentially stable for Eq.* (3) *if* $\mu < 0$*.*

Following Theorem 2, an exponentially decreasing bound on the solution $\xi_{\phi}(t)$ to linear DDEs of the form (3) can be assembled by computing $\alpha$ satisfying $\mu < \alpha < 0$ and the coefficient $K > 0$.

#### **3.1 Identifying the Rightmost Roots**

Due to the significance of characteristic roots in the context of stability and bifurcation analysis, numerical methods for identifying the roots, particularly the rightmost ones, of linear (or linearized) DDEs have been extensively studied over the past few decades; see, e.g., [3,11,43,45]. There are indeed complete methods for isolating real roots of polynomial-exponential functions, for instance [37] and [15] based on cylindrical algebraic decomposition (CAD). Nevertheless, as soon as non-trivial exponential functions arise in the characteristic equation, there appear to be few, if any, symbolic approaches to detecting complex roots of the equation.

In this paper, we find an $\alpha$ that bounds the spectrum from the right of the complex plane by resorting to the numerical approach developed in [11]. The computation therein discretizes the solution operator using linear multistep (LMS) methods to approximate the eigenvalues of linear DDEs with multiple constant delays, with an absolute error of $\mathcal{O}(\tau^p)$ for sufficiently small stepsize $\tau$, where $\mathcal{O}(\cdot)$ is the big-O notation and $p$ depends on the order of the LMS method. A well-developed MATLAB package called DDE-BIFTOOL [10] is furthermore available to mechanize the computation, as will be demonstrated in our forthcoming examples.
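To illustrate the discretization idea (a simplification of ours, with explicit Euler in place of a higher-order LMS method, and not DDE-BIFTOOL itself), consider again the scalar DDE $\dot{x}(t) = -x(t-1)$: eigenvalues $\mu$ of the discretized solution operator approximate $\mathrm{e}^{\lambda\tau}$ for characteristic roots $\lambda$, so $\ln|\mu|/\tau$ approximates $\Re(\lambda)$:

```python
import numpy as np

# Sketch of the solution-operator discretization (ours): x'(t) = -x(t-1),
# stepsize tau = 1/m over one delay interval.
m = 200
tau = 1.0 / m
# State (x_k, x_{k-1}, ..., x_{k-m}); one Euler step: x_{k+1} = x_k - tau * x_{k-m}.
C = np.zeros((m + 1, m + 1))
C[0, 0] = 1.0
C[0, m] = -tau
C[1:, :-1] = np.eye(m)            # shift the stored history by one slot
mu = np.linalg.eigvals(C)         # eigenvalues approximate e^{lambda * tau}
alpha_max = np.log(np.abs(mu)).max() / tau
print(alpha_max)                  # close to -0.318, the rightmost root's real part
```

With $\tau = 0.005$ the estimate carries an $\mathcal{O}(\tau)$ error, consistent with the first-order scheme; higher-order LMS methods sharpen this to $\mathcal{O}(\tau^p)$.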

#### **3.2 Constructing** *K*

By the inverse Laplace transform (cf. Theorem 5.2 in [19] for a detailed proof), we have $\xi_{\phi'}(t) = \lim_{V \to \infty} \frac{1}{2\pi \mathrm{i}} \int_{\alpha - \mathrm{i}V}^{\alpha + \mathrm{i}V} \mathrm{e}^{zt} h^{-1}(z)\,\mathrm{d}z$ for $z$ satisfying $\Re(z) > \mu$, where $\alpha$ is the exponent associated with the bound on $\xi_{\phi'}(t)$ in Eq. (7), and hence by substituting $z = \alpha + \mathrm{i}\nu$, we have

$$\mathbf{e}^{-\alpha t} \xi\_{\phi'}(t) = \lim\_{V \to \infty} \frac{1}{2\pi} \int\_{-V}^{V} \mathbf{e}^{\mathbf{i}\nu t} h^{-1}(\alpha + \mathbf{i}\nu) \,\mathrm{d}\nu.$$

Since $h^{-1}(z) = \frac{I}{z} + \left(h^{-1}(z) - \frac{I}{z}\right) = \frac{I}{z} + \mathcal{O}\left(1/z^2\right)$, together with the fact that an integral over a quadratically decaying integrand converges, it follows that

$$\mathbf{e}^{-\alpha t}\xi\_{\phi'}(t) = \lim\_{V \to \infty} \frac{1}{2\pi} \int\_{-V}^{V} \mathbf{e}^{\mathbf{i}\nu t} \frac{I}{\alpha + \mathbf{i}\nu} \,\mathrm{d}\nu + \frac{1}{2\pi} \int\_{-\infty}^{\infty} \mathbf{e}^{\mathbf{i}\nu t} \mathcal{O}\left(\frac{1}{(\alpha + \mathbf{i}\nu)^2}\right) \,\mathrm{d}\nu.$$

By taking the norm while observing that $|\mathrm{e}^{\mathrm{i}\nu t}| = 1$, we get

$$\left\|\mathbf{e}^{-\alpha t} \boldsymbol{\xi}\_{\phi'}(t)\right\| \leq \left\| \lim\_{V \to \infty} \frac{1}{2\pi} \underbrace{\int\_{-V}^{V} \mathbf{e}^{\mathrm{i}\nu t} \frac{I}{\alpha + \mathrm{i}\nu} \,\mathrm{d}\nu}\_{\text{(8-a)}} \right\| + \frac{1}{2\pi} \underbrace{\int\_{-\infty}^{\infty} \left\| \mathcal{O}\left(\frac{1}{(\alpha + \mathrm{i}\nu)^{2}}\right)\right\| \,\mathrm{d}\nu}\_{\text{(8-b)}}.\tag{8}$$

For the integral (8-a), the fact<sup>1</sup> that

$$\int\_{-\infty}^{\infty} \frac{\mathbf{e}^{\mathrm{i}ax}}{b + \mathrm{i}x} \, \mathrm{d}x = \int\_{-\infty}^{\infty} \frac{\mathbf{e}^{\mathrm{i}x}}{ab + \mathrm{i}x} \, \mathrm{d}x = \begin{cases} 2\pi \mathrm{e}^{-ab} & \text{if } a, b > 0 \\ 0 & \text{if } a > 0, b < 0, \end{cases} \tag{9}$$

implies

$$\left\| \lim\_{V \to \infty} \frac{1}{2\pi} \int\_{-V}^{V} \mathbf{e}^{\mathbf{i}\nu t} \frac{I}{\alpha + \mathbf{i}\nu} \, \mathrm{d}\nu \right\| \le \begin{cases} 1, & \forall t > 0, \,\forall \alpha > 0 \\ 0, & \forall t > 0, \,\forall \alpha < 0. \end{cases} \tag{10}$$
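As a side check (ours, not part of the development), identity (9) behind the bound (10) can be confirmed numerically; the concrete choice $a = b = 1$ below is arbitrary, and the truncation at $\pm V$ introduces an error of order $1/V$:

```python
import cmath, math

# Numerically confirming identity (9) for a = b = 1: the truncated integral
# of e^{ix}/(1 + ix) over [-V, V] should approach 2*pi*e^{-1} as V grows.
V, N = 2000.0, 200000
h = 2 * V / N
total = 0.0
for i in range(N):
    x = -V + (i + 0.5) * h                      # midpoint rule
    total += (cmath.exp(1j * x) / (1 + 1j * x)).real * h   # imaginary parts cancel by symmetry
print(total, 2 * math.pi / math.e)              # the two values closely agree
```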

Notice that the second integral (8-b) is computable, since it is convergent and independent of $t$. The underlying computation of the *improper integral*, however, can be rather time-consuming. We therefore take a detour and compute an upper bound of (8-b) in the form of a *definite integral*, due to Lemma 4, which suffices to constitute an exponential estimation of $\xi_{\phi'}(t)$ while reducing the computational effort spent on the integration.

<sup>1</sup> The integral in (9) is divergent for a = 0 or b = 0 in the sense of a Riemann integral.

**Lemma 4.** *There exists* $M > 0$ *such that inequality* (11) *below holds for any* $\alpha > \mu$*.*

$$\int\_{-\infty}^{\infty} \left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathbf{i}\nu)^2} \right) \right\| \, \mathrm{d}\nu \le \int\_{-M}^{M} \left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathbf{i}\nu)^2} \right) \right\| \, \mathrm{d}\nu + \frac{8n}{M} \left( \|A\| + \|B\| \, \mathrm{e}^{-r\alpha} \right) \tag{11}$$

*where* $\mu = \max_{\lambda \in \sigma} \Re(\lambda)$*,* $z = \alpha + \mathrm{i}\nu$*, and* $n$ *is the order of* $A$ *and* $B$*.*

*Proof.* The proof depends essentially on constructing a threshold $M > 0$ such that the integral over $|\nu| > M$ can be bounded, thus transforming the improper integral in question into a definite one. To find such an $M$, observe that

$$\left\|\mathcal{O}\left(\frac{1}{z^2}\right)\right\| = \left\|h^{-1}(z) - \frac{I}{z}\right\| \le \left\|h^{-1}(z)\right\| \left\|I - \frac{h(z)}{z}\right\| \le \frac{\left\|h^{-1}(z)\right\|}{|z|} (\|A\| + \|B\| \mathrm{e}^{-r\alpha}).$$

Without loss of generality, suppose the entry of $h^{-1}(z)$ at position $(i, j)$ takes the form

$$\begin{aligned} \left(h^{-1}\right)\_{ij} = \left(\sum\_{k=0}^{n-1} p\_k^{ij} (\mathbf{e}^{-rz}) z^k\right) / \det(h(z)) &= \left(\sum\_{k=0}^{n-1} p\_k^{ij} (\mathbf{e}^{-rz}) z^k\right) / (z^n + \sum\_{k=0}^{n-1} q\_k (\mathbf{e}^{-rz}) z^k) \\ &= \frac{1}{z} \left(\sum\_{k=0}^{n-1} p\_k^{ij} (\mathbf{e}^{-rz}) z^{k-n+1}\right) / (1 + \sum\_{k=0}^{n-1} q\_k (\mathbf{e}^{-rz}) z^{k-n}), \end{aligned}$$

where $p_k^{ij}(\cdot)$ and $q_k(\cdot)$ are polynomials in $\mathrm{e}^{-rz}$ serving as coefficients of $z^k$. Since $|\mathrm{e}^{-rz}|$ is bounded by $\mathrm{e}^{-r\alpha}$ along the vertical line $z = \alpha + \mathrm{i}\nu$, we can conclude that there exist $P_k^{ij}$ and $Q_k$ such that $|p_k^{ij}(\mathrm{e}^{-rz})| \le P_k^{ij}$ and $|q_k(\mathrm{e}^{-rz})| \le Q_k$, with $P_{n-1}^{ij} = 1$ if $i = j$, and $0$ otherwise. Furthermore, on the vertical line $z = \alpha + \mathrm{i}\nu$, if $|\nu| \ge 1$, then

$$\begin{aligned} \left| \sum\_{k=0}^{n-1} p\_k^{ij} (\mathbf{e}^{-rz}) z^{k-n+1} \right| &\leq \left| p\_{n-1}^{ij} (\mathbf{e}^{-rz}) \right| + \sum\_{k=0}^{n-2} \left| p\_k^{ij} (\mathbf{e}^{-rz}) z^{-1} \right| \leq P\_{n-1}^{ij} + \sum\_{k=0}^{n-2} P\_k^{ij} \left| z^{-1} \right|, \\\left| 1 + \sum\_{k=0}^{n-1} q\_k (\mathbf{e}^{-rz}) z^{k-n} \right| &\geq 1 - \sum\_{k=0}^{n-1} \left| q\_k (\mathbf{e}^{-rz}) \right| \left| z^{k-n} \right| \geq 1 - \sum\_{k=0}^{n-1} Q\_k \left| z^{-1} \right|. \end{aligned}$$

We can thus choose $|\nu| > M = \max_{1 \le i,j \le n} \left\{1,\; 2\sum_{k=0}^{n-1} Q_k,\; \sum_{k=0}^{n-2} P_k^{ij}\right\}$, which implies

$$\begin{aligned} \left| \left( \sum\_{k=0}^{n-1} p\_k^{\mathrm{ij}} (\mathbf{e}^{-rz}) z^k \right) / \det(h(z)) \right| &\leq \left| \frac{1}{z} (\sum\_{k=0}^{n-1} p\_k^{\mathrm{ij}} (\mathbf{e}^{-rz}) z^{k-n+1}) / (1 + \sum\_{k=0}^{n-1} q\_k (\mathbf{e}^{-rz}) z^{k-n}) \right| \\ &\leq \left| \frac{1}{z} \right| (P\_{n-1}^{\mathrm{ij}} + \sum\_{k=0}^{n-2} P\_k^{\mathrm{ij}} \left| z^{-1} \right|) / (1 - \sum\_{k=0}^{n-1} Q\_k \left| z^{-1} \right|) \leq \frac{2}{|z|} (1 + P\_{n-1}^{\mathrm{ij}}) \leq \frac{4}{|z|}, \end{aligned}$$

where the third inequality holds since $|\nu| > M$. It then follows, for $|\nu| > M$, that

$$\left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathrm{i}\nu)^2} \right) \right\| \le \frac{\left\| h^{-1}(z) \right\|}{|z|} (\|A\| + \|B\| \mathrm{e}^{-r\alpha}) \le \frac{4n}{\nu^2} (\|A\| + \|B\| \mathrm{e}^{-r\alpha}),$$

and thereby

$$\begin{split} \int\_{-\infty}^{\infty} \left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathrm{i}\nu)^{2}} \right) \right\| \mathrm{d}\nu &\leq \int\_{-M}^{M} \left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathrm{i}\nu)^{2}} \right) \right\| \, \mathrm{d}\nu + 2 \int\_{M}^{\infty} \frac{4n}{\nu^{2}} (\left\| A \right\| + \left\| B \right\| \mathrm{e}^{-r\alpha}) \, \mathrm{d}\nu \\ &\leq \int\_{-M}^{M} \left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathrm{i}\nu)^{2}} \right) \right\| \, \mathrm{d}\nu + \frac{8n}{M} \left( \left\| A \right\| + \left\| B \right\| \, \mathrm{e}^{-r\alpha} \right) . \end{split}$$

This completes the proof. 

Equations (8), (10) and (11) yield that $\|\mathrm{e}^{-\alpha t} \xi_{\phi'}(t)\|$ is upper-bounded by

$$K = \frac{1}{2\pi} \left( \int\_{-M}^{M} \left\| \mathcal{O} \left( \frac{1}{(\alpha + \mathbf{i}\nu)^2} \right) \right\| \, \mathrm{d}\nu + \frac{8n}{M} \left( \|A\| + \|B\| \, \mathrm{e}^{-r\alpha} \right) \right) + 1\_0(\alpha), \tag{12}$$

for all $t > 0$. Here $M$ is the constant given in Lemma 4, while $\mathbb{1}_0 \colon (\mu, \infty) \setminus \{0\} \to \{0, 1\}$ is the indicator function<sup>2</sup> of $\{\alpha \mid \alpha > 0\}$, i.e., $\mathbb{1}_0(\alpha) = 1$ for $\alpha > 0$ and $\mathbb{1}_0(\alpha) = 0$ for $\mu < \alpha < 0$.
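As a concrete illustration of Eq. (12), the sketch below (ours, not the MATHEMATICA implementation) carries out the construction for a hypothetical scalar instance: the linearization (23) of Wright's equation appearing in Example 3, i.e. $n = 1$, $\|A\| = 0$, $\|B\| = 1$, $r = 1$, with the spectral bound $\alpha = -0.3$ taken as an assumption:

```python
import cmath, math

# Sketch of the construction of K in Eq. (12) for x'(t) = -x(t-1) (ours):
# n = 1, ||A|| = 0, ||B|| = 1, r = 1, alpha = -0.3 (rightmost root ~ -0.318).
nA, nB, r, alpha, n = 0.0, 1.0, 1.0, -0.3, 1

h = lambda z: z + cmath.exp(-r * z)        # scalar characteristic "matrix"
# Threshold M from Lemma 4: here Q_0 = e^{-r*alpha} bounds |q_0(e^{-rz})|.
M = max(1.0, 2 * math.exp(-r * alpha))

# Midpoint rule for the definite integral of |h^{-1}(z) - 1/z| over [-M, M];
# the fine grid resolves the peak near the rightmost characteristic root.
N = 200000
step = 2 * M / N
I = 0.0
for i in range(N):
    z = complex(alpha, -M + (i + 0.5) * step)
    I += abs(1 / h(z) - 1 / z) * step

K = (I + (8 * n / M) * (nA + nB * math.exp(-r * alpha))) / (2 * math.pi)
# alpha < 0, so the indicator 1_0(alpha) contributes nothing.
print(M, K)
```

Here $M$ evaluates to $2\mathrm{e}^{0.3} \approx 2.69972$, matching the value reported for Eq. (23) in Example 3, and $K$ lands close to the reported $3.28727$ (the exact figure depends on the quadrature used).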

In contrast to the existential estimation guarantee established in Theorem 2, the explicit construction of $\alpha$ and $K$ gives a constructive quantitative criterion that permits reducing an unbounded safety verification problem to its bounded counterpart:

**Theorem 3 (Equivalence of bounded and unbounded safety).** *Given a set of initial states* $\mathcal{X} \subseteq \mathbb{R}^n$ *and a set of bad states* $\mathcal{U} \subseteq \mathbb{R}^n$ *satisfying* $\mathbf{0} \notin \mathcal{U}$*, suppose we have* $\alpha$ *satisfying* $\mu < \alpha < 0$ *and* $K$ *from Eq.* (12)*. Let* $\hat{K} = K \left(1 + \|B\| \int_0^r \mathrm{e}^{-\alpha\tau}\,\mathrm{d}\tau\right) \|\mathcal{X}\|$*. Then there exists* $T^* < \infty$*, defined as*

$$T^\* \hat{=} \max\{0, \inf\{T \mid \forall t > T \colon [-\hat{K}\mathbf{e}^{\alpha t}, \hat{K}\mathbf{e}^{\alpha t}]^n \cap \mathcal{U} = \emptyset\}\},\tag{13}$$

*such that for any* $T > T^*$*, the system* (3) *is* $\infty$*-safe iff it is* $T$*-safe.*

*Proof.* The "only if" part is for free, as $\infty$-safety by definition subsumes $T$-safety. For the "if" direction, the constructed $K$ in Eq. (12) suffices as an upper bound of $\|\mathrm{e}^{-\alpha t} \xi_{\phi'}(t)\|$, and hence by Theorem 2, $\|\xi_{\phi}(t)\| \le \hat{K} \mathrm{e}^{\alpha t}$ for any $t \ge 0$ and $\phi$ constrained by $\mathcal{X}$. As a consequence, it suffices to show that $T^*$ given by Eq. (13) is finite, which then by definition implies that system (3) is safe over $t > T^*$. Note that the assumption $\mathbf{0} \notin \mathcal{U}$ implies that there exists a ball $\mathcal{B}(\mathbf{0}, \delta)$ such that $\mathcal{B}(\mathbf{0}, \delta) \cap \mathcal{U} = \emptyset$. Moreover, $\hat{K} \mathrm{e}^{\alpha t}$ is strictly monotonically decreasing in $t$, and thus $T = \max\{0, \ln(\delta/\hat{K})/\alpha\}$ is an upper bound<sup>3</sup> of $T^*$, which further implies $T^* < \infty$.

*Example 2 (PD-controller* [17]*).* Consider a PD-controller with linear dynamics defined, for t <sup>≥</sup> 0, as

$$
\dot{y}(t) = v(t); \quad \dot{v}(t) = -\kappa\_p \left( y(t-r) - y^\* \right) - \kappa\_d v(t-r), \tag{14}
$$

which controls the position $y$ and velocity $v$ of an autonomous vehicle by adjusting its acceleration according to the current distance to a reference position $y^*$. A constant time

<sup>2</sup> We rule out the case of α = 0, which renders the integral in Eq. (12) divergent.

<sup>3</sup> Note that the larger δ is, the tighter bound T will be.

delay $r$ is introduced to model the time lag due to sensing, computation, transmission, and/or actuation. We instantiate the parameters following [17] as $\kappa_p = 2$, $\kappa_d = 3$, $y^* = 1$, and $r = 0.35$. The system described by Eq. (14) then has one equilibrium at $(1; 0)$, whose stability is equivalent to that of the zero equilibrium of the following system, with $\hat{y} = y - 1$ and $\hat{v} = v$:

$$
\dot{\hat{y}}(t) = \hat{v}(t); \quad \dot{\hat{v}}(t) = -2\hat{y}(t-r) - 3\hat{v}(t-r). \tag{15}
$$

Suppose we are interested in establishing the safety of the system (15) over an unbounded time domain, relative to the set of initial states $\mathcal{X} = [-0.1, 0.1] \times [0, 0.1]$ and the set of unsafe states $\mathcal{U} = \{(\hat{y}; \hat{v}) \mid |\hat{y}| > 0.2\}$. Following our construction process, we obtain automatically the key quantities (depicted in Fig. 1) $\alpha = -0.5$, $M = 11.9125$, $K = 7.59162$ and $\hat{K} = 2.21103$, which consequently yield $T^* = 4.80579$ s. By Theorem 3, the unbounded safety verification problem is thus reduced to a $T$-bounded one for any $T > T^*$, inasmuch as $\infty$-safety is equivalent to $T$-safety for the underlying dynamics.
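The reduction step itself is a one-liner: since $\mathcal{U}$ constrains only $|\hat{y}|$, the enclosing box $[-\hat{K}\mathrm{e}^{\alpha t}, \hat{K}\mathrm{e}^{\alpha t}]^2$ is disjoint from $\mathcal{U}$ as soon as $\hat{K}\mathrm{e}^{\alpha t} \le 0.2$. The sketch below (ours, hard-coding the constants reported above) reproduces $T^*$ via the bound $\max\{0, \ln(\delta/\hat{K})/\alpha\}$ from the proof of Theorem 3:

```python
import math

# Reproducing T* for the PD-controller (our sketch): the box leaves
# U = {|y_hat| > 0.2} once K_hat * e^{alpha*t} <= 0.2.
alpha, K_hat, delta = -0.5, 2.21103, 0.2
T_star = max(0.0, math.log(delta / K_hat) / alpha)
print(T_star)        # ~ 4.80579 s, matching the bound reported in the text
```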

The box $[-\hat{K} \mathrm{e}^{\alpha t}, \hat{K} \mathrm{e}^{\alpha t}]^n$ in Eq. (13) can be viewed as an overapproximation of all trajectories originating from $\mathcal{X}$. As shown in the right part of Fig. 1, this overapproximation, however, is obviously too conservative to be used in proving or disproving almost any safety specification of practical interest. The contribution of our approach lies in the reduction of unbounded verification problems to their bounded counterparts, thereby yielding a quantitative time bound $T^*$ that substantially "trims off" the verification effort pertaining to $t > T^*$. The derived $T$-safety verification task can be tackled effectively by methods dedicated to bounded verification of DDEs of the form (3), or more generally (1), e.g., the approaches in [17] and [4].

**Fig. 1.** Left: the identified rightmost roots of $h(z)$ in DDE-BIFTOOL and an upper bound $\alpha = -0.5$ such that $\max_{\lambda \in \sigma} \Re(\lambda) < \alpha < 0$; Center: $M = 11.9125$, which suffices to split and hence upper-bound the improper integral $\int_{-\infty}^{\infty} \left\|\mathcal{O}(1/z^2)\right\|\,\mathrm{d}\nu$ in Eq. (11); Right: the obtained time instant $T^* = 4.80579$ s guaranteeing the equivalence of $\infty$-safety and $T$-safety of the PD-controller, for any $T > T^*$.

#### **4 Nonlinear Dynamics**

In this section, we address a more general form of dynamics featuring substantial nonlinearity, by resorting to linearization techniques and thereby establishing a quantitative stability criterion, analogous to the linear case, for nonlinear delayed dynamics.

Consider a singly delayed version of Eq. (1):

$$\begin{cases} \dot{\mathbf{x}}\left(t\right) = \mathbf{f}\left(\mathbf{x}\left(t\right), \mathbf{x}\left(t-r\right)\right), \quad t \in [0, \infty) \\ \mathbf{x}\left(t\right) = \boldsymbol{\phi}\left(t\right), \quad t \in [-r, 0] \end{cases} \tag{16}$$

with *f* being a nonlinear vector field involving possibly non-polynomial functions. Let

$$f\left(\mathbf{x}, \mathbf{y}\right) = A\mathbf{x} + B\mathbf{y} + g(\mathbf{x}, \mathbf{y}), \text{ with } A = f\_{\mathbf{x}}\left(\mathbf{0}, \mathbf{0}\right), B = f\_{\mathbf{y}}\left(\mathbf{0}, \mathbf{0}\right),$$

where $f_{\mathbf{x}}$ and $f_{\mathbf{y}}$ are the Jacobian matrices of $f$ with respect to $\mathbf{x}$ and $\mathbf{y}$, respectively; $g$ is a vector-valued, higher-order term whose Jacobian matrix at $(\mathbf{0}, \mathbf{0})$ is $O$.

By dropping the high-order term *g* in *f*, we get the linearized counterpart of Eq. (16):

$$\begin{cases} \dot{\mathbf{x}}(t) = A\mathbf{x}\left(t\right) + B\mathbf{x}\left(t - r\right), \quad t \in [0, \infty) \\ \mathbf{x}(t) = \phi\left(t\right), \quad t \in [-r, 0] \end{cases} \tag{17}$$

which falls in the scope of linear dynamics specified in Eq. (3), and therefore is associated with a characteristic equation of the same form as that in Eq. (4). Equation (17) will be in the sequel referred to as the linearization of Eq. (16) at the steady state **0**, and σ is used to denote the spectrum of the characteristic equation corresponding to Eq. (17).

In light of the well-known Hartman-Grobman theorem [18,20] from the realm of dynamical systems, the local behavior of a nonlinear dynamical system near a (hyperbolic) equilibrium is qualitatively the same as that of its linearization near this equilibrium. The following statement uncovers the connection between the local asymptotic behavior of a nonlinear system and the spectrum of its linearization:

**Theorem 4 (Locally exponential stability** [6,36]**).** *Suppose* $\max_{\lambda \in \sigma} \Re(\lambda) < \alpha < 0$*. Then* $\mathbf{x} = \mathbf{0}$ *is a locally exponentially stable equilibrium of the nonlinear system* (16)*. In fact, there exist* $\delta > 0$ *and* $K > 0$ *such that*

$$\|\phi\| \le \delta \implies \|\xi\_{\phi}(t)\| \le K \|\phi\| \, \mathrm{e}^{\alpha t/2}, \quad \forall t \ge 0,$$

*where* $\xi_{\phi}(t)$ *is the solution to Eq.* (16)*. If* $\Re(\lambda) > 0$ *for some* $\lambda \in \sigma$*, then* $\mathbf{x} = \mathbf{0}$ *is unstable.*

Akin to the linear case, Theorem 4 establishes an existential guarantee that the solution to the nonlinear delayed dynamics approaches the zero equilibrium exponentially for initial conditions within a $\delta$-neighborhood of this equilibrium. The need to construct $\alpha$, $K$ and $\delta$ quantitatively in Theorem 4, as is essential to our automatic verification approach, invokes again the fundamental solution $\xi_{\phi'}(t)$ to the linearized dynamics in Eq. (17):

**Lemma 5 (Variation-of-constants** [19,36]**).** *Consider nonhomogeneous systems of the form*

$$\begin{cases} \dot{\mathbf{x}}\left(t\right) = A\mathbf{x}\left(t\right) + B\mathbf{x}\left(t - r\right) + \eta\left(t\right), \quad t \in [0, \infty) \\ \mathbf{x}\left(t\right) = \phi\left(t\right), \quad t \in [-r, 0] \end{cases} \tag{18}$$

*Let* $\xi_{\phi}(t)$ *be the solution to Eq.* (18)*. Denote by* $\xi_{\phi'}(t)$ *the solution that satisfies Eq.* (17) *for* $t \ge 0$ *and satisfies a variation of the initial condition as* $\phi'(0) = I$ *and* $\phi'(t) = O$ *for all* $t \in [-r, 0)$*. Then for* $t \ge 0$*,*

$$\xi\_{\phi}(t) = \xi\_{\phi'}(t)\phi(0) + \int\_0^t \xi\_{\phi'}(t-\tau)B\phi(\tau-r)\,\mathrm{d}\tau + \int\_0^t \xi\_{\phi'}(t-\tau)\eta(\tau)\,\mathrm{d}\tau,\tag{19}$$

*where* $\phi$ *is extended to* $[-r, \infty)$ *with* $\phi(t) = 0$ *for* $t > 0$*.*

In what follows, we give a constructive quantitative estimation of the solutions to nonlinear dynamics, which admits a reduction of the problem of constructing an exponential upper bound for a nonlinear system to that for its linearization, as is immediately evident from the constructive proof.

**Theorem 5 (Exponential estimation).** *Suppose* $\max_{\lambda \in \sigma} \Re(\lambda) < \alpha < 0$*. Then there exist* $K > 0$ *and* $\delta > 0$ *such that* $\|\xi_{\phi'}(t)\| \le K \mathrm{e}^{\alpha t}$ *for any* $t \ge 0$*, and*

$$\|\phi\| \le \delta \implies \|\xi\_{\phi}(t)\| \le K \mathrm{e}^{-r\alpha} \left(1 + \|B\| \int\_0^r \mathrm{e}^{-\alpha \tau} \, \mathrm{d}\tau\right) \|\phi\| \, \mathrm{e}^{\alpha t/2}, \quad \forall t \ge 0,$$

*where* $\xi_{\phi}(t)$ *is the solution to the nonlinear system* (16) *and* $\xi_{\phi'}(t)$ *is the fundamental solution to its linearized counterpart* (17)*.*

*Proof.* The existence of $K$ follows directly from Eq. (7) in Theorem 2. By the variation-of-constants formula (19), we have, for $t \ge 0$,

$$\xi\_{\phi}(t) = \xi\_{\phi'}(t)\phi(0) + \int\_0^t \xi\_{\phi'}(t-\tau)B\phi(\tau-r)\,\mathrm{d}\tau + \int\_0^t \xi\_{\phi'}(t-\tau)g(\mathbf{x}(\tau), \mathbf{x}(\tau-r))\,\mathrm{d}\tau,\tag{20}$$

where $\phi$ is extended to $[-r, \infty)$ with $\phi(t) = 0$ for $t > 0$. Define $\mathbf{x}_t^{\phi}(\cdot) \in \mathcal{C}_r$ as $\mathbf{x}_t^{\phi}(\theta) = \xi_{\phi}(t + \theta)$ for $\theta \in [-r, 0]$. Then $g(\cdot, \cdot)$ being a higher-order term yields that for any $\epsilon > 0$, there exists $\delta' > 0$ such that $\|\mathbf{x}_t^{\phi}\| \le \delta'$ implies $\|g(\mathbf{x}(t), \mathbf{x}(t - r))\| \le \epsilon \|\mathbf{x}_t^{\phi}\|$. Due to the fact that $\|\xi_{\phi'}(t)\| \le K \mathrm{e}^{\alpha t}$ and the monotonicity of this bound with $\alpha < 0$, we have $\|\mathbf{x}_t^{\phi'}\| \le K \mathrm{e}^{\alpha(t - r)}$. This, together with Eq. (20), leads to

$$\begin{split} \left\| \mathbf{x}\_{t}^{\phi} \right\| &\leq K \left\| \phi \right\| \mathrm{e}^{\alpha (t-r)} + \int\_{0}^{r} K \left\| B \right\| \left\| \phi \right\| \mathrm{e}^{\alpha (t-r)} \mathrm{e}^{-\alpha \tau} \, \mathrm{d}\tau + \int\_{0}^{t} K \mathrm{e}^{\alpha (t-r)} \mathrm{e}^{-\alpha \tau} \epsilon \left\| \mathbf{x}\_{\tau}^{\phi} \right\| \, \mathrm{d}\tau \\ &= K \left( 1 + \left\| B \right\| \int\_{0}^{r} \mathrm{e}^{-\alpha \tau} \, \mathrm{d}\tau \right) \left\| \phi \right\| \, \mathrm{e}^{\alpha (t-r)} + \epsilon K \mathrm{e}^{\alpha (t-r)} \int\_{0}^{t} \mathrm{e}^{-\alpha \tau} \left\| \mathbf{x}\_{\tau}^{\phi} \right\| \, \mathrm{d}\tau. \end{split}$$

Hence,

$$\mathrm{e}^{-\alpha t} \left\| \mathbf{x}\_{t}^{\phi} \right\| \leq K \mathrm{e}^{-r\alpha} \left( 1 + \left\| B \right\| \int\_{0}^{r} \mathrm{e}^{-\alpha \tau} \, \mathrm{d} \tau \right) \left\| \phi \right\| + \epsilon K \mathrm{e}^{-r\alpha} \int\_{0}^{t} \mathrm{e}^{-\alpha \tau} \left\| \mathbf{x}\_{\tau}^{\phi} \right\| \, \mathrm{d} \tau.$$

By the Grönwall-Bellman inequality [1] we obtain

$$\mathrm{e}^{-\alpha t} \|\mathbf{x}^{\phi}\_{t}\| \le K \mathrm{e}^{-r\alpha} \left(1 + \|B\| \int\_{0}^{r} \mathrm{e}^{-\alpha \tau} \, \mathrm{d}\tau\right) \|\phi\| \, \mathrm{e}^{\epsilon K \mathrm{e}^{-r\alpha} t}$$

and thus

$$\|\mathbf{x}\_t^{\phi}\| \le K \mathrm{e}^{-r\alpha} \left(1 + \|B\| \int\_0^r \mathrm{e}^{-\alpha \tau} \, \mathrm{d}\tau\right) \|\phi\| \, \mathrm{e}^{\epsilon K \mathrm{e}^{-r\alpha} t + \alpha t}.$$

Set $\epsilon \le -\alpha/(2K\mathrm{e}^{-r\alpha})$ and $\delta = \min\left\{\delta',\; \delta' \Big/ \left(K\mathrm{e}^{-r\alpha}\left(1 + \|B\| \int_0^r \mathrm{e}^{-\alpha\tau}\,\mathrm{d}\tau\right)\right)\right\}$. This yields, for any $t \ge 0$,

$$\|\phi\| \le \delta \implies \|\xi\_{\phi}(t)\| \le K \mathrm{e}^{-r\alpha} \left(1 + \|B\| \int\_0^r \mathrm{e}^{-\alpha \tau} \, \mathrm{d}\tau\right) \|\phi\| \, \mathrm{e}^{\alpha t/2},$$

completing the proof. 

The above constructive quantitative estimation of the solutions to nonlinear dynamics gives rise to a reduction, analogous to the linear case, of unbounded verification problems to bounded ones, in the presence of a local stability criterion.

**Theorem 6 (Equivalence of safety properties).** *Given a set of initial states* $\mathcal{X} \subseteq \mathbb{R}^n$ *and bad states* $\mathcal{U} \subseteq \mathbb{R}^n$ *satisfying* $\mathbf{0} \notin \mathcal{U}$*, let* $\sigma$ *denote the spectrum of the characteristic equation corresponding to Eq.* (17)*. Suppose that* $\max_{\lambda \in \sigma} \Re(\lambda) < \alpha < 0$*, and the fundamental solution to Eq.* (17) *satisfies* $\|\xi_{\phi'}(t)\| \le K \mathrm{e}^{\alpha t}$ *for any* $t \ge 0$*. Let* $\tilde{K} = K \mathrm{e}^{-r\alpha} \left(1 + \|B\| \int_0^r \mathrm{e}^{-\alpha\tau}\,\mathrm{d}\tau\right) \|\mathcal{X}\|$*. Then there exist* $\delta > 0$ *and* $T^* < \infty$*, defined as*

$$T^\* \hat{=} \max\{0, \inf\{T \mid \forall t > T \colon [-\tilde{K}\mathrm{e}^{\alpha t/2}, \tilde{K}\mathrm{e}^{\alpha t/2}]^n \cap \mathcal{U} = \emptyset\}\},$$

*such that if* $\|\mathcal{X}\| \le \delta$*, then for any* $T > T^*$*, the system* (16) *is* $\infty$*-safe iff it is* $T$*-safe.*

*Proof.* The proof is analogous to that of Theorem 3, particularly following from the local stability property stated in Theorem 5. 

Note that for nonlinear dynamics, the equivalence of safety claimed by Theorem 6 holds under the condition $\|\mathcal{X}\| \le \delta$, due to the locality stemming from linearization. In fact, a set $\mathcal{B} \subseteq \mathbb{R}^n$ satisfying $\|\mathcal{B}\| \le \delta$ describes (a subset of) the basin of attraction around the local *attractor* $\mathbf{0}$, in the sense that any initial condition in $\mathcal{B}$ will lead the trajectory eventually into the attractor. Consequently, for verification problems where $\mathcal{X}$ is not contained in $\mathcal{B}$, if the reachable set originating from $\mathcal{X}$ is guaranteed to be subsumed within $\mathcal{B}$ over the time interval $[T - r, T]$, then $T + T^*$ suffices as a bound to avoid unbounded verification; namely, the system is $\infty$-safe iff it is $T'$-safe for any $T' > T + T^*$. This is furthermore demonstrated by the following example.

*Example 3 (Population dynamics* [4,25]*).* Consider a slightly modified version of the delayed logistic equation introduced by G. Hutchinson in 1948 (cf. [22])

$$
\dot{N}(t) = N(t)[1 - N(t - r)], \quad t \ge 0,\tag{21}
$$

which is used to model a single population whose per-capita rate of growth $\dot{N}(t)/N(t)$ depends on the population size $r$ time units in the past. This is a reasonable model for a population featuring a significant minimum reproductive age, or one depending on a resource, such as food, that needs time to grow and hence to recover its availability.

If we change variables by putting $u = N - 1$, then Eq. (21) becomes the famous Wright's equation (see [44]):

$$
\dot{u}(t) = -u(t-r)[1+u(t)], \quad t \ge 0. \tag{22}
$$

The steady state $N = 1$ is now $u = 0$. We instantiate the verification problem of Eq. (22) over $[-r, \infty)$ as $\mathcal{X} = [-0.2, 0.2]$, $\mathcal{U} = \{u \mid |u| > 0.6\}$, under a constant delay $r = 1$. Note that delay-independent Lyapunov techniques, e.g. [32], cannot solve this problem, since Wright's conjecture [44], recently proven in [40], together with corollaries thereof implies that there is no Lyapunov functional guaranteeing absolute stability of Eq. (22) under arbitrary constant delays. To achieve an exponential estimation, we first linearize the dynamics by dropping the nonlinearity $u(t)u(t-r)$:

$$
\dot{v}(t) = -v(t-1), \quad t \ge 0. \tag{23}
$$

Following our constructive approach, we obtain automatically for Eq. (23) $\alpha = -0.3$ (see the left of Fig. 2), $M = 2.69972$ and $K = 3.28727$, and thereby for Eq. (22) $\delta = 0.00351678$, $\tilde{K} = 0.0338039$ and $T^* = 0$ s. It is worth highlighting that the bounded verification method in [17], with Taylor models of order 5, verified an overapproximation $\Omega$ of the reachable set of system (22) over the time interval $[14.5, 15.5]$ to be enclosed in the $\delta$-neighborhood of $\mathbf{0}$, i.e., $\|\Omega\| \le \delta$; the overapproximation itself, however, escaped from this region around $t = 55.3$ s and soon tended to diverge, as depicted in the right part of Fig. 2, and thus cannot by itself prove unbounded safety. However, with our result of $T^* = 0$ s and the fact that $\Omega$ over $[-1, 15.5]$ is disjoint from $\mathcal{U}$, we are able to claim safety of the underlying system over an infinite time domain.
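The constants for Eq. (22) can be reproduced from $K$ and $\alpha$ alone. The sketch below (ours) traces the construction in the proof of Theorem 5, hard-coding $K = 3.28727$ and $\alpha = -0.3$ as obtained for the linearization (23) and using the bound $|u(t)u(t-r)| \le \|\mathbf{x}_t^{\phi}\|^2$ on the dropped nonlinearity, which lets $\delta' = \epsilon$:

```python
import math

# Reconstructing delta, K_tilde and T* for Wright's equation (our sketch,
# hard-coding K = 3.28727 and alpha = -0.3 reported for Eq. (23)).
K, alpha, r, nB = 3.28727, -0.3, 1.0, 1.0          # nB = ||B||
integral = (math.exp(-alpha * r) - 1) / (-alpha)   # int_0^r e^{-alpha*tau} dtau
eps = -alpha / (2 * K * math.exp(-r * alpha))      # epsilon from Theorem 5's proof
c = K * math.exp(-r * alpha) * (1 + nB * integral)
# For g(u, u_r) = -u*u_r we have |g| <= ||x_t||^2, hence delta' = eps works:
delta = min(eps, eps / c)
K_tilde = c * delta                                # here equal to eps
# U = {|u| > 0.6}: the tube K_tilde*e^{alpha*t/2} starts below 0.6, so T* = 0.
T_star = max(0.0, 2 * math.log(0.6 / K_tilde) / alpha)
print(delta, K_tilde, T_star)
```

The printout recovers $\delta \approx 0.00351678$, $\tilde{K} \approx 0.0338039$ and $T^* = 0$ s, as reported above.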

**DDEs with Multiple Different Delays.** Delay differential equations with multiple fixed discrete delays are extensively used in the literature to model practical systems where components coupled with different time lags coexist and interact. We remark that the previous theorems on exponential estimation and equivalence of safety for the single-delay case extend immediately to systems of the form (1) with almost no change, except for replacing $\|B\| \mathrm{e}^{-r\alpha}$ with $\sum_{i=1}^{k} \|A_i\| \mathrm{e}^{-r_i \alpha}$ and $\|B\|$ with $\sum_{i=1}^{k} \|A_i\|$, where $A_i$ denotes the matrix attached to $\mathbf{x}(t - r_i)$ in the linearization. For a slightly modified form of the variation-of-constants formula under multiple delays, we refer the reader to Theorem 1.2 in [19].

**Fig. 2.** Left: the identified rightmost roots of $h(z)$ and an upper bound $\alpha = -0.3$ such that $\max_{\lambda \in \sigma} \Re(\lambda) < \alpha < 0$; Right: overapproximation of the reachable set of the system (22) produced by the method in [17] using Taylor models for bounded verification. Together with this overapproximation we prove the equivalence of $\infty$-safety and $T$-safety of the system for any $T > 15.5$ s.

#### **5 Implementation and Experimental Results**

To further investigate the scalability and efficiency of our constructive approach, we have carried out a prototypical implementation<sup>4</sup> in Wolfram MATHEMATICA, selected for its built-in primitives for integration and matrix operations. By interfacing with DDE-BIFTOOL<sup>5</sup> (in MATLAB or GNU OCTAVE) for identifying the rightmost characteristic roots of linear (or linearized) DDEs, our implementation computes an appropriate $T^*$ that admits a reduction of unbounded verification problems to bounded ones. A set of benchmark examples from the literature has been evaluated on a 3.6 GHz Intel Core i7 processor with 8 GB RAM running 64-bit Ubuntu 16.04. All computations of $T^*$ were safely rounded and finished within 6 s for each of the examples, including Examples 2 and 3. In what follows, we demonstrate in particular the applicability of our technique to DDEs featuring non-polynomial dynamics, high dimensionality, and multiple delays.

*Example 4 (Disease pathology* [25,27,32]*).* Consider the following non-polynomial DDE for t <sup>≥</sup> 0:

$$\dot{p}(t) = \frac{\beta \theta^n p(t-r)}{\theta^n + p^n(t-r)} - \gamma p(t),\tag{24}$$

where $p(t)$ is positive and indicates the number of mature blood cells in circulation, while $r$ models the delay between cell production and cell maturation. We consider the case $\theta = 1$ as in [32]. The remaining constants are instantiated as $n = 1$, $\beta = 0.5$, $\gamma = 0.6$ and $r = 0.5$. The unbounded verification problem of Eq. (24) over $[-r, \infty)$ is configured as $\mathcal{X} = [0, 0.2]$ and $\mathcal{U} = \{p \mid |p| > 0.3\}$. The linearization of Eq. (24) at the zero equilibrium then reads

$$
\dot{p}(t) = -0.6p(t) + 0.5p(t - 0.5). \tag{25}
$$
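The Jacobian entries behind Eq. (25) can be checked numerically. The sketch below (ours) approximates $A = f_{\mathbf{x}}(\mathbf{0}, \mathbf{0})$ and $B = f_{\mathbf{y}}(\mathbf{0}, \mathbf{0})$ of the non-polynomial right-hand side of Eq. (24), under the instantiation of Example 4, by central differences:

```python
# Numerically linearizing the dynamics (24) at the origin (our sketch):
# f(p, p_r) = beta*theta^n*p_r/(theta^n + p_r^n) - gamma*p.
beta, gamma, theta, n = 0.5, 0.6, 1.0, 1

def f(x, y):                     # x = p(t), y = p(t - r)
    return beta * theta**n * y / (theta**n + y**n) - gamma * x

step = 1e-6                      # central-difference stencil width
A = (f(step, 0.0) - f(-step, 0.0)) / (2 * step)
B = (f(0.0, step) - f(0.0, -step)) / (2 * step)
print(A, B)                      # approximately -0.6 and 0.5, recovering Eq. (25)
```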

<sup>4</sup> http://lcs.ios.ac.cn/~chenms/tools/UDDER.tar.bz2.

<sup>5</sup> http://ddebiftool.sourceforge.net/.

With $\alpha = -0.07$ obtained from DDE-BIFTOOL, our implementation produces for Eq. (25) the values $M = 2.23562$ and $K = 1.75081$, and thereby for Eq. (24) $\delta = 0.0163426$, $\tilde{K} = 0.0371712$ and $T^* = 0$ s. Thereafter, by the bounded verification method in [17] with Taylor models of order 5, an overapproximation of the reachable set of system (24) over the time interval $[25.45, 25.95]$ was verified to be enclosed in the $\delta$-neighborhood of $\mathbf{0}$. This fact, together with $T^* = 0$ s and the overapproximation over $[-0.5, 25.95]$ being disjoint from $\mathcal{U}$, yields safety of the system (24) over $[-0.5, \infty)$.

*Example 5 (Gene regulation* [12,36]*).* To examine the scalability of our technique in higher dimensions, we recall an instantiation of Eq. (2) with n = 5, namely with 5 state components **x** = (x<sub>1</sub>, ..., x<sub>5</sub>) and 5 delay terms **r** = (0.1, 0.2, 0.4, 0.8, 1.6) involved, g(x) = −x, β<sub>j</sub> = 1 for j = 1, ..., 5, X = B((1, 1, 1, 1, 1), 0.2) and U = {**x** | |x<sub>1</sub>| > 1.5}. With α = −0.04 derived from DDE-BIFTOOL, our implementation returns M = 64.264, K = 4.42207, K̂ = 49.1463 and T<sup>∗</sup> = 87.2334 s, thereby yielding the equivalence of ∞-safety to T-safety for any T > T<sup>∗</sup>. Furthermore, the safety guarantee issued by the bounded verification method in [4], based on rigorous simulations under T = 88 s, suffices to prove safety of the system over an infinite time horizon.
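The reported thresholds are consistent with a reduction time of the shape T<sup>∗</sup> = max(0, ln(K/ε)/|α|), i.e., the first time at which an exponential envelope K·e<sup>αt</sup> drops below a margin ε to the unsafe set (ε = 1.5 for Example 5, ε = 0.3 for Example 4). This envelope shape and the choice of ε are our reading of the reported numbers, not a quotation of the formula from the text, but the sketch below reproduces both values:

```python
import math

def reduction_time(K, alpha, eps):
    """Candidate T* with K*exp(alpha*T) <= eps for all T >= T* (alpha < 0).

    The envelope K*e^{alpha t} and the margin eps are assumptions used to
    cross-check the reported figures, not the paper's own definition.
    """
    assert alpha < 0 and K > 0 and eps > 0
    return max(0.0, math.log(K / eps) / -alpha)

# Example 5: K_hat = 49.1463, alpha = -0.04, margin eps = 1.5 (|x1| > 1.5 unsafe)
t5 = reduction_time(49.1463, -0.04, 1.5)
# Example 4: K_tilde = 0.0371712, alpha = -0.07, margin eps = 0.3 (|p| > 0.3 unsafe)
t4 = reduction_time(0.0371712, -0.07, 0.3)
print(round(t5, 4), t4)  # close to the reported 87.2334 s and 0 s
```

In Example 4 the envelope starts below the margin (K̃ = 0.0371712 < 0.3), so the clamp at zero yields T<sup>∗</sup> = 0 s immediately, matching the text.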

#### **6 Conclusion**

We have presented a constructive method, based on linearization and spectral analysis, for computing a delay-dependent, exponentially decreasing upper bound, if one exists, that encloses trajectories of a DDE originating from a given set of initial functions. We showed that such an enclosure facilitates a reduction of the verification problem over an unbounded temporal horizon to a bounded one. Preliminary experimental results on a set of representative benchmarks from the literature demonstrate that our technique effectively extends the scope of existing bounded verification techniques to unbounded verification tasks.

Looking at future directions, we plan to pursue a tight integration of our technique into automatic tools dedicated to bounded verification of DDEs, as well as to investigate more permissive forms of stability, e.g., asymptotic stability, that may admit a similar reduction-based approach. An extension of our method to more general forms of DDEs, e.g., with time-varying or distributed (i.e., weighted averages of) delays, will also be of interest. Additionally, we expect to refine our enclosure of system trajectories by resorting to a topologically finite partition of the initial set of functions.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### Author Index

Albarghouthi, Aws I-278 André, Étienne I-520 Arcaini, Paolo I-401 Arcak, Murat I-591 Arechiga, Nikos II-137 Ashok, Pranav I-497 Avni, Guy I-630

Backes, John II-231 Bansal, Suguman I-60 Barbosa, Haniel II-74 Barrett, Clark I-443, II-23, II-74, II-116 Bayless, Sam II-231 Becker, Heiko II-155 Beckett, Ryan II-305 Beillahi, Sidi Mohamed II-286 Berkovits, Idan II-245 Biswas, Ranadeep II-324 Bloem, Roderick I-630 Bouajjani, Ahmed II-267, II-286 Brain, Martin II-116 Breck, Jason I-335 Busatto-Gaston, Damien I-572

Černý, Pavol I-140 Češka, Milan I-475 Chatterjee, Krishnendu I-630 Chen, Mingshuai I-650 Cimatti, Alessandro I-376 Coenen, Norine I-121 Cook, Byron II-231 Cyphert, John I-335

D'Antoni, Loris I-3, I-278, I-335 Damian, Andrei II-344 Darulova, Eva II-155, II-174 Davis, Jennifer A. I-366 Deshmukh, Jyotirmoy II-137 Dill, David L. I-443 Dimitrova, Rayna I-241 Dodge, Catherine II-231 Drăgoi, Cezara II-344

Dreossi, Tommaso I-432 Drews, Samuel I-278

Elfar, Mahmoud I-180 Emmi, Michael II-324, II-534 Enea, Constantin II-267, II-286, II-324, II-534 Ernst, Gidon II-208 Erradi, Mohammed II-267

Farzan, Azadeh I-200 Faymonville, Peter I-421 Fedyukovich, Grigory I-259 Feldman, Yotam M. Y. II-405 Feng, Shenghua I-650 Ferreira, Tiago I-3 Finkbeiner, Bernd I-121, I-241, I-421, I-609 Fränzle, Martin I-650 Fremont, Daniel J. I-432 Frohn, Florian II-426 Furbach, Florian I-355

Gacek, Andrew II-231 Ganesh, Vijay II-367 Gao, Sicun II-137 García Soto, Miriam I-297 Gastin, Paul I-41 Gavrilenko, Natalia I-355 Ghosh, Shromona I-432 Giannarakis, Nick II-305 Giesl, Jürgen II-426 Gomes, Victor B. F. I-387 Griggio, Alberto I-376 Guo, Xiaojie II-496 Gupta, Aarti I-259 Gurfinkel, Arie I-161, II-367

Hasuo, Ichiro I-401, I-520 Heljanko, Keijo I-355 Henzinger, Thomas A. I-297, I-630 Hong, Chih-Duo I-455 Hu, Alan J. II-231 Hu, Qinheping I-335

Huang, Derek A. I-443 Humphrey, Laura R. I-366 Hur, Chung-Kil II-445 Ibeling, Duligur I-443 Iosif, Radu II-43 Jagannathan, Suresh II-459 Jain, Mitesh I-553 Jonáš, Martin II-64 Julian, Kyle I-443 Kahsai, Temesghen II-231 Kang, Eunsuk I-219 Kapinski, James II-137 Katz, Guy I-443 Kim, Edward I-432 Kim, Eric S. I-591 Kincaid, Zachary II-97 Kingston, Derek B. I-366 Klein, Felix I-609 Kochenderfer, Mykel J. I-443 Kocik, Bill II-231 Kölbl, Martin I-79 Kong, Soonho II-137 Könighofer, Bettina I-630 Kotelnikov, Evgenii II-231 Křetínský, Jan I-475, I-497 Kukovec, Jure II-231 Lafortune, Stéphane I-219 Lal, Akash II-386 Lange, Julien I-97 Lau, Stella I-387 Lazarus, Christopher I-443 Lazić, Marijana II-245 Lee, Juneyoung II-445 Lesourd, Maxime II-496 Leue, Stefan I-79 Li, Jianwen II-3 Li, Yangjia II-187 Lim, Rachel I-443 Lin, Anthony W. I-455 Liu, Junyi II-187 Liu, Mengqi II-496 Liu, Peizun II-386 Liu, Tao II-187 Lopes, Nuno P. II-445 Losa, Giuliano II-245

Madhukar, Kumar I-259 Madsen, Curtis I-540 Magnago, Enrico I-376 Mahajan, Ratul II-305 Majumdar, Rupak I-455 Manolios, Panagiotis I-553 Markey, Nicolas I-22 McLaughlin, Sean II-231 Memarian, Kayvan I-387 Meyer, Roland I-355 Militaru, Alexandru II-344 Millstein, Todd I-315 Monmege, Benjamin I-572 Mukherjee, Sayan I-41 Murray, Toby II-208 Myers, Chris J. I-540 Myreen, Magnus O. II-155

Nagar, Kartik II-459 Neupane, Thakur I-540 Niemetz, Aina II-116 Nori, Aditya I-315 Nötzli, Andres II-23, II-74

Padhi, Saswat I-315 Padon, Oded II-245 Pajic, Miroslav I-180 Pichon-Pharabod, Jean I-387 Piskac, Ruzica I-609 Ponce-de-León, Hernán I-355 Prabhu, Sumanth I-259 Pranger, Stefan I-630 Preiner, Mathias II-116

Rabe, Markus N. II-84 Ravanbakhsh, Hadi I-432 Reed, Jason II-231 Reps, Thomas I-335 Reynier, Pierre-Alain I-572 Reynolds, Andrew II-23, II-74, II-116 Rieg, Lionel II-496 Roohi, Nima II-137 Roussanaly, Victor I-22 Roveri, Marco I-376 Rozier, Kristin Y. II-3 Rümmer, Philipp I-455 Rungta, Neha II-231

Sagiv, Mooly II-405 Sammartino, Matteo I-3 Sanán, David II-515 Sánchez, César I-121 Sankur, Ocan I-22, I-572 Santolucito, Mark I-609 Schilling, Christian I-297 Schledjewski, Malte I-421 Schwenger, Maximilian I-421 Seshia, Sanjit A. I-432, I-591 Sewell, Peter I-387 Shah, Parth I-443 Shao, Zhong II-496 Sharma, Rahul I-315 Shemer, Ron I-161 Shoham, Sharon I-161, II-245, II-405 Siegel, Stephen F. II-478 Silva, Alexandra I-3 Silverman, Jake II-97 Sizemore, John II-231 Solar-Lezama, Armando II-137 Srinivasan, Preethi II-231 Srivathsan, B. I-41 Stalzer, Mark II-231 Stenger, Marvin I-421 Strejček, Jan II-64 Subotić, Pavle II-231

Tatlock, Zachary II-155 Tentrup, Leander I-121, I-421 Thakoor, Shantanu I-443 Tinelli, Cesare II-23, II-74, II-116 Tizpaz-Niari, Saeid I-140 Tonetta, Stefano I-376 Torfah, Hazem I-241, I-421 Tripakis, Stavros I-219 Trivedi, Ashutosh I-140

Vandikas, Anthony I-200 Vardi, Moshe Y. I-60, II-3 Varming, Carsten II-231 Vazquez-Chanlatte, Marcell I-432 Vediramana Krishnan, Hari Govind II-367 Vizel, Yakir I-161, II-367 Volkova, Anastasia II-174

Waga, Masaki I-520 Wahl, Thomas II-386 Walker, David II-305 Wang, Shuling II-187 Wang, Yu I-180 Weininger, Maximilian I-497 Whaley, Blake II-231 Widder, Josef II-344 Wies, Thomas I-79 Wilcox, James R. II-405 Wu, Haoze I-443

Xu, Xiao II-43 Xue, Bai I-650

Ying, Mingsheng II-187 Ying, Shenggang II-187 Yoshida, Nobuko I-97

Zeleznik, Luka I-297 Zeljić, Aleksandar I-443 Zennou, Rachid II-267 Zhan, Bohua II-187 Zhan, Naijun I-650, II-187 Zhang, Zhen I-540 Zhang, Zhenya I-401 Zhao, Yongwang II-515 Zheng, Hao I-540